Druid 0.8.2 RC2

Druid 0.8.2 RC2 was released yesterday with a couple more bug fixes!

#1868 Removing parent paths causes watchers of the “announcements” path to get stuck

#1855 fix [GreaterThan,LessThan,Equals] HavingSpecs

#1862 Add timeout to shutdown request to middle manager for indexing service

We also updated upgrade procedures in the release notes to avoid some edge cases.

Unless something major comes up, we expect this version to be marked stable in about a week.

The download is available here:

http://static.druid.io/artifacts/releases/druid-0.8.2-rc2-bin.tar.gz

Documentation:

http://druid.io/docs/latest

Updated Release notes are below:

Updating from 0.8.1

If you are using union queries, please make sure to update broker nodes prior to updating any historical nodes, realtime nodes, or indexing service.

Otherwise, you can follow standard rolling update procedures.
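
If you are not sure whether any of your queries are union queries, a quick check along the following lines may help. This is only a sketch: it assumes your queries are available as JSON objects and that a union query is one whose dataSource has type "union" (possibly wrapped in a nested "query" dataSource). The helper name `uses_union_datasource` and the datasource names are made up for illustration.

```python
import json


def uses_union_datasource(query):
    """Return True if the query reads from a union dataSource.

    Assumption: a union dataSource is a JSON object with "type": "union".
    Nested queries (dataSource of type "query") are checked recursively.
    """
    ds = query.get("dataSource")
    if not isinstance(ds, dict):
        # A plain string dataSource is a single table, not a union.
        return False
    if ds.get("type") == "union":
        return True
    if ds.get("type") == "query":
        return uses_union_datasource(ds.get("query", {}))
    return False


# Hypothetical timeseries query over two made-up datasources.
union_query = {
    "queryType": "timeseries",
    "dataSource": {"type": "union", "dataSources": ["events_a", "events_b"]},
    "granularity": "all",
    "intervals": ["2015-10-01/2015-10-02"],
    "aggregations": [{"type": "count", "name": "rows"}],
}

# Same query against a single datasource.
plain_query = dict(union_query, dataSource="events_a")

if __name__ == "__main__":
    for q in (union_query, plain_query):
        print(uses_union_datasource(q), json.dumps(q["dataSource"]))
```

If the check returns True for any query you run in production, update the brokers first as described above.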

New Features

#1744 Memcached connection pooling

#1753 Allow SegmentMetadataQuery to skip cardinality and size calculations

#1609 Experimental Kafka simple consumer based firehose

#1800 Experimental Hybrid L1/L2 cache
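
As a rough illustration of #1753, the sketch below builds a segmentMetadata query that asks for neither cardinality nor size. It assumes the skip is controlled through the `analysisTypes` field (the field named in #1782) and that an empty list means "skip both"; please check the documentation for the exact semantics. The function name and the datasource name are made up.

```python
import json


def segment_metadata_query(datasource, intervals, analysis_types=()):
    """Build a segmentMetadata query body as a plain dict.

    Assumption: "analysisTypes" lists the optional analyses to run
    (for example "cardinality" and "size"); an empty list skips them.
    """
    return {
        "queryType": "segmentMetadata",
        "dataSource": datasource,
        "intervals": list(intervals),
        "analysisTypes": list(analysis_types),
    }


# Cheap introspection: column names and types only (hypothetical datasource).
cheap = segment_metadata_query("wikipedia", ["2015-10-01/2015-10-02"])

# Cardinality only, still skipping the size calculation.
with_cardinality = segment_metadata_query(
    "wikipedia", ["2015-10-01/2015-10-02"], analysis_types=["cardinality"]
)

if __name__ == "__main__":
    print(json.dumps(cheap, indent=2))
```

The resulting JSON would be POSTed to the broker like any other query.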

Improvements

#1821 cache max data timestamp in QueryableIndexStorageAdapter

#1765 Add CPUTimeMetricQueryRunner to ClientQuerySegmentWalker

#1776 Modified the Twitter firehose to process more properties

#1748 Allow ForkingTaskRunner javaOpts to have quoted arguments which contain spaces

#1759 better faster smaller roaring bitmaps

#1755 update druid-api for timestamp parsing speedup

#1756 improving msging when indexing service is not found

#1739 Allow SegmentAnalyzer to read columns from StorageAdapter, allow SegmentMetadataQuery to query IncrementalIndexSegments on realtime node

#1732 Add support for a configurable default segment history period for segmentMetadata queries and GET /datasources/ lookups

#1695 Allow writing InputRowParser extensions that use hadoop/any libraries

#1688 More memcached metrics

#1712 Add dimension extraction functionality to SearchQuery

#1696 Add CPU time to metrics for segment scanning.

#1718 Adds task duration to indexer console for completed tasks.

#1725 Don’t check for sortedness if we already know GenericIndexedWriter isn’t sorted

#1699 composing emitter module to use multiple emitters together

#1639 New plumber

#1604 Allow task to override ForkingTaskRunner tunings and jvm settings

#1542 add endpoint to fetch rule history for all datasources

#1682 Support parsing of BytesWritable strings in HadoopDruidIndexerMapper

#1622 Support for JSON Smile format for EventReceiverFirehoseFactory

#1654 Add ability to provide taskResource for IndexTask.

Bug Fixes

#1868 Removing parent paths causes watchers of the “announcements” path to get stuck

#1855 fix [GreaterThan,LessThan,Equals] HavingSpecs

#1862 Add timeout to shutdown request to middle manager for indexing service

#1822 support multiple non-consecutive intervals in outer query of nested group-by

#1811 Server discovery selector ipv6 friendly

#1823 For dataSource inputSpec in hadoop batch ingestion, use configured query granularity for reading existing segments instead of NONE

#1818 Add hashCode and equals to stock lookups

#1812 Bump server-metrics to 0.2.5 to catch a few fixes.

#1806 Fix index exceeded msg to give maxRowCount as well

#1801 Fix ClientInfoResource

#1795 Try and make AnnouncerTest a bit more predictable

#1797 ingest segment firehose ut

#1798 Update httpcomponents and aws-sdk

#1792 GroupByQueryRunnerTest for hyperUnique finalizing post aggregators

#1781 Fix failure in nested groupBy with multiple aggregators with same fie…

#1790 Cleanup kafka-extraction-namespace

#1782 Add analysisTypes to SegmentMetadataQuery cache key

#1730 fix #1727 - Union bySegment queries fix

#1783 Separate ListColumnIncluderator cache key parts with nul bytes

#1740 fix #1715 - Zombie tasks able to acquire locks after failure

#1778 Redirect fixes

#1777 fail task if finishjob throws any exception

#1775 SQLMetadataConnector: Retry table creation, in case something goes wrong.

#1772 RemoteTaskRunner: Fix for starting an overlord before any workers ever existed.

#1764 Enable logging for memcached in factory

#1760 Update memcached client for better concurrency in metrics.

#1761 LocalDataSegmentPusher: Fix for Hadoop + relative paths.

#1763 fix NPE and duplicate metric keys

#1758 Fix memcached cache provider injection and add test

#1747 Account for potential gaps in hydrants in sink initialization, hydrant swapping (e.g. h0, h1, h4)

#1751 Soften concurrency requirements on IncrementalIndexTest

#1736 IngestSegmentFirehoseFactoryTimelineTest for overshadowing of the middle of a segment.

#1741 Add better concurrency testing to IncrementalIndexTest

#1743 Disable metadata publishing attempt in example script

#1697 Better logging of URIExtractionNamespace failures due to missing files

#1702 do not have dataSource twice in path to segment storage on hdfs

#1710 Add some basic latching to concurrency testing in IncrementalIndexTest

#1734 fix broken integration-test

#1731 fix NPE with regex extraction function

#1700 update indexing in the helper to use multiple persists and merge

#1721 fix for “java.io.IOException: No FileSystem for scheme: hdfs” error

#1694 Better timing and locking in NamespaceExtractionCacheManagerExecutorsTest

#1703 add null check for task context.

#1637 Make jetty scheduler threads daemon thread

#1658 Hopefully add better timeouts and ordering to JDBCExtractionNamespaceTest

#1620 Allow long values in the key or value fields for URIExtractionNamespace

#1578 Fix UT and documentation to the extraction filter

#1687 do not let user override hadoop job settings explicitly provided by druid code

#1689 Update LZ4Transcoder to match Compressed strategy factory type.

#1685 Close output streams and channels loudly when creating segments.

#1686 Replace funky imports with standard ones.

#1683 Remove unused Indexer interface.

#1632 Inner Query should build on sub query

#1676 fix convert segment task

#1672 Migrate TestDerbyConnector to a JUnit @Rule

#1675 update druid-api for jackson 2.4.6

#1668 Code cleanup for CachingClusteredClientTest

#1669 Upgrade dependencies

#1663 TaskActionToolbox: Remove allowOlderVersions, lift interval constraint

#1619 update server metrics

#1661 Poll rules immediately after change

#1665 Consolidate SQL retrying by moving logic into the connectors.

#1648 handle commas in the path before calling MultipleInputs.addInputPaths

#1647 Approx histogram “integration” unit test

Documentation

#1814 Adjust realtime constraints in the docs.

#1794 update R / Python clients

#1784 Minor documentation fixes for CONTRIBUTING.md

#1774 update doc about aggregation field in merge task and a null check

#1742 add docs for search filter

#1737 Docs: Suggest hadoopyString parser for Hadoop.

#1735 add pivot as a UI

#1723 fix typo in segments.md

#1720 update ingestion faq to mention dataSource inputSpec

#1717 in configuration/index.md s/instantialize/initialize

#1698 Timeseries skipEmptyBucket docs.

#1662 Add documentation for pathFormat in batch ingestion

#1673 Fix batch ingestion doc

#1670 fix formatting

#1656 more docs for common questions

#1664 add documentation about TimedShutoff firehose

#1633 swap description and dimension column for some JVM metrics

#1793 fixing the link to chunkPeriod doc

Thanks to all the contributors to this release!

@anwenxu @cheddar @dclim @drcrallen @fjy @gianm @guobingkun @Hailei @himanshug @jon-wei @nishantmonu51 @pjain1 @potto007 @qix @rasahner @xvrl

I said this in a number of other threads but wanted to mention it here as well.
If you are using Pivot ( https://github.com/implydata/pivot ) and upgrade to Druid 0.8.2 you should use the --use-segment-metadata flag for a much better experience with introspection. Thank you Druid team for making segmentMetadata query go from sucky to awesome.