PythonicNinja/pydrill

pydrill

Python Driver for Apache Drill.

Schema-free SQL Query Engine for Hadoop, NoSQL and Cloud Storage

Features

  • Python 2/3 compatibility
  • Support for all REST API calls, including profiles, options, and metrics (see the docs for the full list)
  • Mapping of results to native Python types
  • Compatibility with pandas DataFrames
  • Drill authentication using PAM

Installation

Released version from PyPI (https://pypi.python.org/pypi/pydrill):

$ pip install pydrill

Latest version from git:

$ pip install git+https://github.com/PythonicNinja/pydrill.git

Sample usage

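from pydrill.client import PyDrill
# Assumption: ImproperlyConfigured is exported by pydrill.exceptions.
from pydrill.exceptions import ImproperlyConfigured

# Connect to Drill's REST endpoint (8047 is Drill's default web port).
drill = PyDrill(host='localhost', port=8047)

if not drill.is_active():
    raise ImproperlyConfigured('Please run Drill first')

# Query a JSON file through the dfs storage plugin.
yelp_reviews = drill.query('''
    SELECT * FROM
    `dfs.root`.`./Users/macbookair/Downloads/yelp_dataset_challenge_academic_dataset/yelp_academic_dataset_review.json`
    LIMIT 5
''')

# Each result row can be read like a dictionary.
for result in yelp_reviews:
    print("%s: %s" % (result['type'], result['date']))

# pandas dataframe
df = yelp_reviews.to_dataframe()
print(df[df['stars'] > 3])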
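
For context on the REST API support listed under Features: Drill accepts SQL over HTTP as a JSON POST to its /query.json endpoint. The sketch below builds such a request with the standard library only, without sending it. It is not pydrill's internal implementation; the helper name build_drill_query_request is hypothetical, and the host and port defaults simply mirror the sample above.

```python
import json
from urllib.request import Request


def build_drill_query_request(sql, host="localhost", port=8047):
    """Build (but do not send) a POST request for Drill's /query.json endpoint.

    Hypothetical helper for illustration; not part of pydrill.
    """
    # Drill's REST API expects a JSON body with the query type and text.
    payload = json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")
    return Request(
        url="http://%s:%d/query.json" % (host, port),
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# cp.`employee.json` is the sample data set shipped on Drill's classpath.
request = build_drill_query_request("SELECT * FROM cp.`employee.json` LIMIT 5")
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would submit it, which requires a running Drill instance. pydrill wraps this kind of exchange and adds result mapping and pandas conversion on top.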
