We embarked on the Drill project in late 2012 with two primary objectives:
- Enable agility by getting rid of all the traditional overhead - namely, the need to load data, create and maintain schemas, transform data, etc. We wanted to develop a system that would support the speed and agility at which modern organizations want (or need) to operate in this era.
- Unlock the data housed in non-relational datastores like NoSQL, Hadoop and cloud storage, making it available not only to developers, but also business users, analysts, data scientists and anyone else who can write a SQL query or use a BI tool. Non-relational datastores are capturing an increasing share of the world’s data, and it’s incredibly hard to explore and analyze this data.
Today we’re happy to announce the availability of the production-ready Drill 1.0 release. This release addresses 228 JIRAs on top of the 0.9 release earlier this month. Highlights include:
- Substantial improvements in stability, memory handling and performance
- Improvements in Drill CLI experience with addition of convenience shortcuts and improved colors/alignment
- Substantial additions to documentation including coverage of troubleshooting, performance tuning and many additions to the SQL reference
- Enhancements in join planning to facilitate high speed planning of large and complicated joins
- Add support for new context functions including
CURRENT_USER
andCURRENT_SCHEMA
- Ability to treat all numbers as approximate decimals when reading JSON
- Enhancements in Drill’s text and CSV handling to support first row skipping, configurable field/line delimiters and configurable quoting
- Improved JDBC compatibility (and tracing proxy for easy debugging).
- Ability to do JDBC connections with direct urls (avoiding ZooKeeper)
- Automatic selection of spooling or back-pressure exchange semantics to avoid distributed deadlocks in complex sort-heavy queries
- Improvements in query profile reporting
- Addition of
ILIKE(VARCHAR, PATTERN)
andSUBSTR(VARCHAR, REGEX)
functions
We would not have been able to reach this milestone without the tremendous effort by all the committers and contributors, and we would like to congratulate the entire community on achieving this milestone. While 1.0 is an exciting milestone, it’s really just the beginning of the journey. We’ll release 1.1 next month, and continue with our 4-6 week release cycle, so you can count on many additional enhancements over the coming months.
Also be sure to check out the Apache Software Foundation’s press release.
Happy Drilling!
Tomer Shiran and Jacques Nadeau