
Azure Data Lake

Cloud-based data storage and analytics service

From Wikipedia, the free encyclopedia


Azure Data Lake[1] is a scalable data storage and analytics service. The service is hosted in Azure, Microsoft's public cloud.


History

The Azure Data Lake service was released on November 16, 2016. It is based on COSMOS,[2] an internal Microsoft system used to store and process data for applications such as Azure, AdCenter, Bing, MSN, Skype and Windows Live. COSMOS features a SQL-like query engine called SCOPE, upon which U-SQL was built.[2]

Storage

Data Lake Storage is a cloud service for storing structured, semi-structured or unstructured data produced by sources such as social networks, relational databases, sensors, videos, web apps, and mobile or desktop devices. A single account can store trillions[3] of files, and a single file can exceed a petabyte in size.

Analytics

Data Lake Analytics is a parallel on-demand job service. The parallel processing system is based on Microsoft Dryad.[4] Dryad can represent arbitrary Directed Acyclic Graphs (DAGs) of computation. Data Lake Analytics provides a distributed infrastructure that can dynamically allocate resources so that customers pay for only the services they use. The system uses Apache YARN, the part of Apache Hadoop which governs resource management across clusters. Data Lake Store supports any application that uses the Hadoop Distributed File System (HDFS) interface.[4]
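The idea of expressing a job as a directed acyclic graph can be illustrated with a minimal Python sketch. It is not Dryad's or Data Lake Analytics' API; the vertex names and the `schedule` helper are hypothetical, and the standard-library `graphlib` module (Python 3.9+) stands in for a real scheduler. Each vertex is a computation step, each edge a data dependency, and vertices whose inputs are complete can run in parallel.

```python
from graphlib import TopologicalSorter

# Hypothetical job graph: maps each vertex to the set of vertices it depends on.
dependencies = {
    "read_logs": set(),
    "read_users": set(),
    "filter_errors": {"read_logs"},
    "join": {"filter_errors", "read_users"},
    "aggregate": {"join"},
}


def schedule(graph):
    """Return batches of vertices; vertices in one batch have no pending
    dependencies and could be dispatched to workers in parallel."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the graph is not acyclic
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        batches.append(ready)
        sorter.done(*ready)  # mark the batch finished, unlocking successors
    return batches


batches = schedule(dependencies)
for step, batch in enumerate(batches, start=1):
    print(step, batch)
```

In this sketch the two read steps form the first batch because neither depends on anything, while the join has to wait for both of its inputs.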

U-SQL

U-SQL is a query language for writing parallel data transformation and processing programs in Data Lake Analytics. It combines SQL and C#: it is an evolution of the declarative SQL language with native extensibility through user code written in C#. U-SQL uses C# data types and the C# expression language.
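The pairing of a declarative query shape with expressions written in a general-purpose language can be sketched by analogy in Python. This is not U-SQL syntax and not any Azure API; the `select` helper, the column names, and the sample rows are all hypothetical. The query states which columns and which filter are wanted, while the per-row expressions are ordinary host-language code, much as U-SQL embeds C# expressions in SQL-style statements.

```python
def select(rows, columns, where=lambda row: True):
    """Declarative-style projection and filter.

    `columns` maps each output column name to a user-supplied expression
    (a callable over the input row); `where` is a user-supplied predicate.
    """
    return [
        {name: expr(row) for name, expr in columns.items()}
        for row in rows
        if where(row)
    ]


# Hypothetical input rowset.
rows = [
    {"name": "ada", "score": 12},
    {"name": "bob", "score": 7},
]

result = select(
    rows,
    columns={
        "Name": lambda r: r["name"].upper(),  # host-language string method
        "Score": lambda r: r["score"],
    },
    where=lambda r: r["score"] > 10,
)
print(result)
```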

Retirement

In 2021, Microsoft announced the 2024 retirement of the original Azure Data Lake Storage, now called "Gen1". The related Azure Data Lake Analytics / U-SQL technologies are also being retired.[5] Azure Data Lake Storage Gen2, an extension of Azure Storage, will continue.[6] The suggested replacement technologies are Azure Synapse Analytics and Apache Spark.[7]

