
Apache Kylin Open Source Journey for QCon2015 Beijing

Apache Kylin is an open-source distributed analytics engine for big data that supports SQL and multi-dimensional analysis (OLAP) on Hadoop; it was open sourced in October 2014. This deck traces its path from an internal eBay initiative to an Apache Incubator project and covers the technical challenges, architecture, features, community building, and marketing behind it.

Apache Kylin Open Source Journey 韩卿 | Luke Han Co-Creator & PMC Member lukehan@apache.org 2015-04-25
Agenda • About Apache Kylin • Kylin Open Source Journey • Apache Incubating • Build Community and Ecosystem • The Good, The Bad and The Ugly • Q&A
About Apache Kylin (麒麟) Extreme OLAP Engine for Big Data http://kylin.io Kylin is an open source Distributed Analytics Engine that provides a SQL interface and multi-dimensional analysis (OLAP) on Hadoop, supporting extremely large datasets • First Apache Project open sourced by eBay Inc. • First Apache Project fully contributed from eBay CCOE • Open sourced on Oct 1st, 2014 • Accepted as an Apache Incubator Project on Nov 25th, 2014 • Apache Kylin is an effort undergoing incubation at The Apache Software Foundation (ASF), sponsored by the Incubator.
Technical Challenges • Huge data volume – Table scan • Big table joins – Data shuffling • Analysis at different granularities – Runtime aggregation is expensive • MapReduce job – Batch processing
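The "runtime aggregation is expensive" challenge can be made concrete with a small sketch. The idea behind a pre-computed (MOLAP-style) cube is to aggregate once per combination of dimensions so that a later query becomes a key lookup instead of a scan. The code below is only an illustration of that idea under stated assumptions: the fact rows, the dimension names and the `build_cube` / `query` helpers are hypothetical, and this is not Kylin's cube build engine.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical fact rows: (year, site, category, sales amount).
ROWS = [
    ("2014", "US", "Toys", 10.0),
    ("2014", "US", "Books", 5.0),
    ("2014", "UK", "Toys", 7.0),
    ("2015", "US", "Toys", 3.0),
]
DIMENSIONS = ("year", "site", "category")


def build_cube(rows):
    """Pre-aggregate SUM(amount) for every subset of the dimensions."""
    cube = defaultdict(float)
    # Every subset of dimension positions, from () up to all three.
    index_sets = [
        combo
        for size in range(len(DIMENSIONS) + 1)
        for combo in combinations(range(len(DIMENSIONS)), size)
    ]
    for row in rows:
        amount = row[-1]
        for combo in index_sets:
            # Key = which dimensions are grouped on, plus this row's values for them.
            key = (combo, tuple(row[i] for i in combo))
            cube[key] += amount
    return dict(cube)


def query(cube, **filters):
    """Answer SUM(amount) for equality filters with a single dictionary lookup."""
    combo = tuple(i for i, name in enumerate(DIMENSIONS) if name in filters)
    values = tuple(filters[DIMENSIONS[i]] for i in combo)
    return cube.get((combo, values), 0.0)


CUBE = build_cube(ROWS)
```

For example, `query(CUBE, site="US", category="Toys")` reads one pre-computed entry rather than scanning `ROWS`. The trade-off is build time and storage: the number of dimension subsets doubles with every added dimension.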
Apache Kylin Architecture (diagram) • Clients: 3rd Party App (Web App, Mobile...) via REST API; SQL-Based Tool (BI Tools: Tableau...) via JDBC/ODBC • REST Server, Query Engine, Routing, Metadata • Low Latency (seconds): SQL on the OLAP Cube (HBase), key-value data • Mid Latency (minutes): SQL on Hadoop Hive, star schema data • Cube Build Engine (MapReduce, Streaming...) builds the data cube • Legend: Online Analysis Data Flow, Offline Data Flow • Clients/users interact with Kylin via SQL • The OLAP Cube is transparent to users
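As a rough illustration of the "clients talk to Kylin via SQL over REST" path in the architecture, the sketch below builds (but does not send) an HTTP request for a SQL query. The endpoint path `/kylin/api/query` and the payload fields (`sql`, `project`, `offset`, `limit`, `acceptPartial`) follow Kylin's REST documentation as best recalled and may differ between versions; the host, port, project name, table name and credentials are placeholders, and `build_query_request` is a hypothetical helper.

```python
import base64
import json
import urllib.request


def build_query_request(base_url, project, sql, user, password, limit=100):
    """Build a POST request carrying one SQL statement (assumed payload shape)."""
    payload = {
        "sql": sql,
        "project": project,
        "offset": 0,
        "limit": limit,
        "acceptPartial": False,
    }
    # HTTP Basic authentication header.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/kylin/api/query",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": "Basic " + token,
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values; nothing is sent until urllib.request.urlopen(request) is called.
request = build_query_request(
    "http://localhost:7070",
    project="example_project",
    sql="SELECT site, SUM(amount) FROM sales_fact GROUP BY site",
    user="USER",
    password="PASSWORD",
)
```

Because the cube is transparent to users, the SQL names only the star-schema tables; whether a query is served from the cube or routed elsewhere is decided on the server side.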
Features • Extremely Fast OLAP Engine at scale • ANSI SQL Interface on Hadoop • Seamless Integration with BI Tools, like Tableau • Interactive Query Capability • MOLAP Cube • Compression and Encoding Support • Incremental Build of Cubes • Approximate Query Capability for Distinct Count (HyperLogLog) • Leverages HBase Coprocessor to reduce query latency • Job Management and Monitoring • User-friendly Web GUI to manage, build, monitor and query cubes • Security capability to set ACL at Cube/Project Level • LDAP Integration Support • Streaming Support (coming soon) • 90%tile queries < 5s
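The approximate distinct count feature names HyperLogLog. A minimal sketch of the technique follows, to show why it suits pre-aggregated cubes: each counter is a small fixed-size array, and two counters can be merged by taking register-wise maxima, so partial results can be combined without revisiting raw rows. This is a textbook-style illustration, not Kylin's implementation; the class name, the SHA-1 hash choice and the default precision are assumptions.

```python
import hashlib
import math


class HyperLogLog:
    """Minimal HyperLogLog counter (illustrative only)."""

    def __init__(self, precision=12):
        self.p = precision
        self.m = 1 << precision                      # number of registers
        self.registers = [0] * self.m
        self.alpha = 0.7213 / (1 + 1.079 / self.m)   # standard constant for m >= 128

    def add(self, item):
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")        # use 64 bits of the hash
        width = 64 - self.p
        index = h >> width                           # first p bits choose a register
        rest = h & ((1 << width) - 1)                # remaining bits
        rank = width - rest.bit_length() + 1         # position of the leftmost 1-bit
        self.registers[index] = max(self.registers[index], rank)

    def merge(self, other):
        """Union of two counters: register-wise maximum."""
        if other.p != self.p:
            raise ValueError("precision mismatch")
        self.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]

    def count(self):
        raw = self.alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            return self.m * math.log(self.m / zeros)
        return raw
```

With 4096 registers the typical relative error is roughly 1.04 / sqrt(4096), about 1.6%, regardless of how many distinct values are added. Adding the same value twice never changes the estimate.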
Kylin Open Source Journey • Sep 2013: Initiative • Jan 2014: POC Completed • Jun 2014: US Patent Filed • Jul 2014: V1.0 Beta Released • Oct 2014: V1.0 GA Released, Open Sourced • Nov 2014: Apache Incubator Project • Next: Apache Top Project
Ready for Open Source • Open Source from Day One • Internal vs External • Intellectual Property • Legal • Domain • License – Apache/MIT/BSD/GPL... • Team
Patent • Why? • How? • Patent vs Open Source
Phase I: Open Source on Github • Code pushed to github.com on Oct 1st, 2014
Phase II: Apache Incubator • Accepted as an Apache Incubator Project on Nov 25th, 2014
Why & How Apache? • Hadoop Ecosystem Home • Branding • Community • The Apache Way
Incubation Progress
Incubator Project Proposal • IPMC & PPMC • Mentors and Champion • Committers
Infrastructure Setup • Mailing List – Private@ – Dev@ • Source Code Repo – git & svn – Migration • Website • JIRA • Wiki
IP Clearance & Release • Kylin for brand name? • Apache License • GPL Dependency? • Apache Release • README, LICENSE, NOTICE, DISCLAIMER • Source Headers • Licensing of dependencies • Binaries
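The "Source Headers" item can be checked mechanically before a release. The sketch below walks a source tree and lists files whose first bytes lack the opening phrase of the standard ASF source header. It is a simplified stand-in for a real release-audit tool such as Apache Rat; the `missing_headers` helper, the extension list and the 2048-byte window are assumptions made for illustration.

```python
import os

# Opening phrase of the standard ASF source file header.
ASF_MARKER = "Licensed to the Apache Software Foundation (ASF)"


def missing_headers(root, extensions=(".java", ".py", ".xml"), head_bytes=2048):
    """Return relative paths of source files that lack the ASF header marker."""
    missing = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(extensions):
                continue  # only audit the chosen source file types
            path = os.path.join(dirpath, name)
            with open(path, "r", encoding="utf-8", errors="replace") as handle:
                head = handle.read(head_bytes)  # headers sit at the top of the file
            if ASF_MARKER not in head:
                missing.append(os.path.relpath(path, root))
    return sorted(missing)
```

Calling `missing_headers("path/to/checkout")` returns an empty list when every audited file carries the header. A check like this covers only the project's own files; the licensing of dependencies and the contents of LICENSE and NOTICE still need manual review.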
Team Onboarding to the Apache Way • Community over Code • Mailing list discussions • Vote • Code Quality and Style • JIRA for each issue and feature • Merge Pull Requests • Recruiting contributors/committers
How to contribute? • Join the mailing list: dev@kylin.incubator.apache.org • Create a JIRA or leave comments • Send a Pull Request/Patch to the Apache GitHub mirror
Graduate to Top-Level Project • Diversity • Complete (and sign off) tasks documented in the status file • Ensure suitability for project name and product name • Demonstrate ability to create Apache releases • Demonstrate community readiness • Ensure that mentors and the IPMC have no remaining issues
Ready for Apache?
Build Community and Ecosystem • What is community? • How to grow a community? • Community over Code!
Marketing - Website • http://kylin.io – Hosted on github.io (GitHub Pages) • http://kylin.incubator.apache.org – Hosted on Apache Infra Server
Marketing - Blog • Publish via eBay Tech Blog to gain industry attention • http://www.ebaytechblog.com/2014/10/20/announcing-kylin-extreme-olap-engine-for-big-data "Like arch-rival Amazon.com, the soon-to-split eBay Inc. is something of an oddity in that it hasn’t historically been a big contributor to the open-source community. But the e-commerce pioneer hopes to change that with the release of the source-code for a homegrown online analytics processing (OLAP) engine that promises to speed up Hadoop while also making it more accessible to everyday enterprise users." -- siliconangle.com
Marketing – Social Media • GitHub – KylinOLAP • Twitter – @ApacheKylin • Hacker News • Facebook – Page: kylin.io • LinkedIn – Group: Kylin • WeChat (微信) – ApacheKylin • ...
Marketing - Media • InfoQ • CSDN • OSChina • ...
Build Community – Mailing List
Build Community – Meetup • Hive Meetup Bay Area, Dec 2014 • Apache Kylin Meetup Bay Area, Dec 2014 • Apache Kylin Tech Talk @AWS Seattle, Dec 2014 • Apache Kylin Meetup Beijing, Dec 2014 • Spark Meetup Bay Area, March 2015 • Kylin Meetup in China, coming soon • ...
Build Community – Conference • Big Data Summit Shanghai, Oct 2014 • Big Data Technology Conference Beijing, Dec 2014 • Database Technology Conference Beijing, April 2015 • Hadoop Summit Europe, April 2015 • QCon Beijing, April 2015 • Strata+Hadoop World London, May 2015 • HBaseCon San Francisco, May 2015 • Hadoop Summit San Jose, June 2015 • ...
Know your community • Google Analytics • Github Statistics • Mailing List • WeChat • ...
Apache Kylin Ecosystem • Kylin OLAP Core: fundamental framework of the Kylin OLAP Engine • Extension (Security, Redis Storage, Spark Engine, Docker): plugins that support additional functions and features • Integration (ODBC Driver, ETL, Drill, SparkSQL): lifecycle management support to integrate with other applications such as BI tools • Interface (Web Console, Customized BI, Ambari/Hue Plugin): lets third-party users build more features via user interfaces atop the Kylin core
Apache Kylin Evolution Roadmap • Initial Prototype for MOLAP: Basic end-to-end POC • MOLAP: Incremental Refresh, ANSI SQL, ODBC Driver, Web GUI, ACL, Open Source • HOLAP: Streaming OLAP, JDBC Driver, New GUI, Excel Support, SparkSQL, ... more • Next Gen: Lambda Arch, Automation, Capacity Management, In-Memory Analysis (TBD), Spark (TBD), Mobile (TBD), ... more • Timeline markers (2013 to 2015): Sep 2013, Jan 2014, Sep 2014, H1 2015, Future (TBD)
Team Philosophy • Excellence of Engineering • Recruit best people • Done is better than perfect • Do academic research • Explain design in simple words • Everyone does dirty work • You write first version, I write second one • Debate, Decision & Delivery
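The Good • Visibility and reputation • Personal growth • Team culture • Project quality • Sense of achievement • Having top experts as neighbors • The whole world is watching you and your code!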
The Bad • Lower development efficiency • Internal project schedule vs external support and issues • Spare time • Roadmap and feature requests from external users
The Ugly • Open source does not mean free • Please respect open source authors • Ask questions the right way
If you want to go fast, go alone. If you want to go far, go together. (African Proverb)
Apache Kylin • Kylin Site: http://kylin.incubator.apache.org, http://kylin.io • Twitter: @ApacheKylin • WeChat (微信): ApacheKylin
@InfoQ infoqchina
