Druid 0.8.2 RC1

We’re happy to announce the release of Druid 0.8.2 RC1!

You can download it here:

http://static.druid.io/artifacts/releases/druid-0.8.2-rc1-bin.tar.gz

Documentation is available at:

http://druid.io/docs/latest

Please file GitHub issues if you find any bugs!

Thanks to all of you who contributed issues, docs, and code!

Release notes are below:

Updating from 0.8.1
There should be no update concerns; standard procedures for rolling updates can be followed.

New Features

  • #1800 Hybrid L1/L2 cache
  • #1753 Allow SegmentMetadataQuery to skip cardinality and size calculations
  • #1744 Memcached connection pooling
  • #1609 Experimental Kafka simple consumer based firehose
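
As a rough illustration of #1753, here is a minimal sketch of a segmentMetadata query body that skips the cardinality and size calculations. It assumes the skip is requested by passing an empty `analysisTypes` list (the field name comes from #1782 below) and that the accepted values are `"cardinality"` and `"size"`; check the segmentMetadata query docs before relying on either. The helper name, the `wikipedia` dataSource and the interval are hypothetical and only there to make the example self-contained.

```python
import json


def build_segment_metadata_query(data_source, intervals, analysis_types=()):
    """Build a segmentMetadata query body as a plain dict.

    Hypothetical helper for illustration only. An empty analysis_types
    is assumed to ask Druid to skip both cardinality and size analysis.
    """
    return {
        "queryType": "segmentMetadata",
        "dataSource": data_source,
        "intervals": list(intervals),
        # Assumed values per the docs: "cardinality", "size"
        "analysisTypes": list(analysis_types),
    }


# Hypothetical dataSource and interval
query = build_segment_metadata_query("wikipedia", ["2015-09-01/2015-10-01"])
print(json.dumps(query, indent=2))
```

The printed JSON would then be POSTed to a broker's query endpoint as with any other Druid query.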

Improvements

  • #1821 cache max data timestamp in QueryableIndexStorageAdapter
  • #1765 Add CPUTimeMetricQueryRunner to ClientQuerySegmentWalker
  • #1776 Modified the Twitter firehose to process more properties
  • #1748 Allow ForkingTaskRunner javaOpts to have quoted arguments which contain spaces
  • #1759 better faster smaller roaring bitmaps
  • #1755 update druid-api for timestamp parsing speedup
  • #1756 improving messaging when indexing service is not found
  • #1739 Allow SegmentAnalyzer to read columns from StorageAdapter, allow SegmentMetadataQuery to query IncrementalIndexSegments on realtime node
  • #1732 Add support for a configurable default segment history period for segmentMetadata queries and GET /datasources/ lookups
  • #1695 Allow writing InputRowParser extensions that use hadoop/any libraries
  • #1688 More memcached metrics
  • #1712 Add dimension extraction functionality to SearchQuery
  • #1696 Add CPU time to metrics for segment scanning.
  • #1718 Adds task duration to indexer console for completed tasks.
  • #1725 Don’t check for sortedness if we already know GenericIndexedWriter isn’t sorted
  • #1699 composing emitter module to use multiple emitters together
  • #1639 New plumber
  • #1604 Allow task to override ForkingTaskRunner tunings and jvm settings
  • #1542 add endpoint to fetch rule history for all datasources
  • #1682 Support parsing of BytesWritable strings in HadoopDruidIndexerMapper
  • #1622 Support for JSON Smile format for EventReceiverFirehoseFactory
  • #1654 Add ability to provide taskResource for IndexTask.

Bug Fixes

  • #1822 support multiple non-consecutive intervals in outer query of nested group-by
  • #1811 Server discovery selector ipv6 friendly
  • #1823 For dataSource inputSpec in hadoop batch ingestion, use configured query granularity for reading existing segments instead of NONE
  • #1818 Add hashCode and equals to stock lookups
  • #1812 Bump server-metrics to 0.2.5 to catch a few fixes.
  • #1806 Fix index exceeded msg to give maxRowCount as well
  • #1801 Fix ClientInfoResource
  • #1795 Try and make AnnouncerTest a bit more predictable
  • #1797 ingest segment firehose ut
  • #1798 Update httpcomponents and aws-sdk
  • #1792 GroupByQueryRunnerTest for hyperUnique finalizing post aggregators
  • #1781 Fix failure in nested groupBy with multiple aggregators with same fie…
  • #1790 Cleanup kafka-extraction-namespace
  • #1782 Add analysisTypes to SegmentMetadataQuery cache key
  • #1730 fix #1727 - Union bySegment queries fix
  • #1783 Separate ListColumnIncluderator cache key parts with nul bytes
  • #1740 fix #1715 - Zombie tasks able to acquire locks after failure
  • #1778 Redirect fixes
  • #1777 fail task if finishjob throws any exception
  • #1775 SQLMetadataConnector: Retry table creation, in case something goes wrong.
  • #1772 RemoteTaskRunner: Fix for starting an overlord before any workers ever existed.
  • #1764 Enable logging for memcached in factory
  • #1760 Update memcached client for better concurrency in metrics.
  • #1761 LocalDataSegmentPusher: Fix for Hadoop + relative paths.
  • #1763 fix NPE and duplicate metric keys
  • #1758 Fix memcached cache provider injection and add test
  • #1747 Account for potential gaps in hydrants in sink initialization, hydrant swapping (e.g. h0, h1, h4)
  • #1751 Soften concurrency requirements on IncrementalIndexTest
  • #1736 IngestSegmentFirehoseFactoryTimelineTest for overshadowing of the middle of a segment.
  • #1741 Add better concurrency testing to IncrementalIndexTest
  • #1743 Disable metadata publishing attempt in example script
  • #1697 Better logging of URIExtractionNamespace failures due to missing files
  • #1702 do not have dataSource twice in path to segment storage on hdfs
  • #1710 Add some basic latching to concurrency testing in IncrementalIndexTest
  • #1734 fix broken integration-test
  • #1731 fix NPE with regex extraction function
  • #1700 update indexing in the helper to use multiple persists and merge
  • #1721 fix for “java.io.IOException: No FileSystem for scheme: hdfs” error
  • #1694 Better timing and locking in NamespaceExtractionCacheManagerExecutorsTest
  • #1703 add null check for task context.
  • #1637 Make jetty scheduler threads daemon thread
  • #1658 Hopefully add better timeouts and ordering to JDBCExtractionNamespaceTest
  • #1620 Allow long values in the key or value fields for URIExtractionNamespace
  • #1578 Fix UT and documentation to the extraction filter
  • #1687 do not let user override hadoop job settings explicitly provided by druid code
  • #1689 Update LZ4Transcoder to match Compressed strategy factory type.
  • #1685 Close output streams and channels loudly when creating segments.
  • #1686 Replace funky imports with standard ones.
  • #1683 Remove unused Indexer interface.
  • #1632 Inner Query should build on sub query
  • #1676 fix convert segment task
  • #1672 Migrate TestDerbyConnector to a JUnit @Rule
  • #1675 update druid-api for jackson 2.4.6
  • #1668 Code cleanup for CachingClusteredClientTest
  • #1669 Upgrade dependencies
  • #1663 TaskActionToolbox: Remove allowOlderVersions, lift interval constraint
  • #1619 update server metrics
  • #1661 Poll rules immediately after change
  • #1665 Consolidate SQL retrying by moving logic into the connectors.
  • #1648 handle commas in the path before calling MultipleInputs.addInputPaths
  • #1647 Approx histogram “integration” unit test

Documentation

  • #1814 Adjust realtime constraints in the docs.
  • #1794 update R / Python clients
  • #1784 Minor documentation fixes for CONTRIBUTING.md
  • #1774 update doc about aggregation field in merge task and a null check
  • #1742 add docs for search filter
  • #1737 Docs: Suggest hadoopyString parser for Hadoop.
  • #1735 add pivot as a UI
  • #1723 fix typo in segments.md
  • #1720 update ingestion faq to mention dataSource inputSpec
  • #1717 in configuration/index.md s/instantialize/initialize
  • #1698 Timeseries skipEmptyBucket docs.
  • #1662 Add documentation for pathFormat in batch ingestion
  • #1673 Fix batch ingestion doc
  • #1670 fix formatting
  • #1656 more docs for common questions
  • #1664 add documentation about TimedShutoff firehose
  • #1633 swap description and dimension column for some JVM metrics
  • #1793 fixing the link to chunkPeriod doc

Thanks to all the contributors to this release!
@anwenxu @cheddar @dclim @drcrallen @fjy @gianm @guobingkun @Hailei @himanshug @jon-wei @nishantmonu51 @pjain1 @potto007 @qix @rasahner @xvrl

I’m going to pin this.

Awesome, can’t wait to give it a spin.