What does Open Source Mean for HDF? Mike Folk, The HDF Group. July 2012, ESIP Summer Meeting 2012
About HDF
What is HDF? A data model: structures for data organization and specification. An open file format: designed for high-volume or complex data. Open source software: works with data in the format. Today we focus only on the software.
HDF4 or HDF5? Both
The HDF Group: At U of Illinois NCSA. Non-profit company since 2006. About 35 staff and $3.5M in revenues.
Mission of The HDF Group: To provide the highest quality software for managing large, complex data sets. To provide outstanding services for users of HDF technologies. To ensure long-term access and usability of data that is stored using HDF technologies.
HDF Communities: Academia, government, commercial. All applications involving complex or big data. Users range from highly proficient software developers to naïve end users.
What we do: Support some large, diverse projects, such as EOS and JPSS (whatever they need). Work for hire (training, consulting, development of HDF core software). General maintenance, QA, and support.
Distribution of revenues by sector
What does OSS mean for HDF?
History of HDF as OSS. Why HDF became FOSS: default. Why HDF stayed FOSS despite objections: not very monetizable; universal access to data. Why HDF should remain FOSS: HDF preservation mission; all of the above.
[Figure: Open Technology Development*. Diagram labels: Community-Maintained, Single Maintainer, GOTS, COTS, Proprietary, OSS, Open GOTS, Closed GOTS, Single Maintainer OSS, Community Maintained OSS, Gated SW, Typical proprietary SW.] * Based on slide 41 from "Open Source Software (OSS or FLOSS), the U.S. Department of Defense (DoD), and NASA", David A. Wheeler, NASA Open Source Summit, March
Intellectual property: U of I was the original owner. Transferred to The HDF Group for a royalty on commercial profits. BSD license.
Benefits of OSS, as it relates to HDF.
TRY BEFORE ADOPT
IF IT ALMOST WORKS, YOU CAN MODIFY IT TO MAKE IT WORK
DEVELOPMENT ACTIVITIES ARE PUBLIC
FREEDOM TO DEVELOP TOOLS THAT MAKE HDF MORE USABLE
LONG TERM ACCESS
Aspects of OSS we're less sure about, as they relate to HDF
UNPAID CONTRIBUTORS CAN DO MUCH CORE WORK
"GIVEN ENOUGH EYEBALLS, ALL BUGS ARE SHALLOW".
FREQUENT DEVELOPMENT CYCLES ARE GOOD
OSS IS EASY TO USE
OSS IS LOW COST
IT IS EASY TO RUN A COMMUNITY-BASED OSS PROJECT
OSS BUSINESS MODELS
What next?
INCREASE AND DIVERSIFY REVENUE TO OFFER A BETTER PRODUCT
BETTER USE OF COMMUNITY!
Thank you
Backup slides and slides I decided not to use (hidden)
Collaboration
  How are we involved with the community? 1. Getting expertise we don't have 2. Asking for input on documentation 3. Testing on misc. platforms and with compilers we don't have
  How are we not involved? 1. Identifying persons who can work with us 2. Implementing features 3. Reviewing code 4. Writing tests 5. Writing tools and wrappers for other languages
  Comment: To improve we need better internal docs and coding standards (library, wrappers, tests, tools)
Contributions from community
  How are we involved with the community? 1. Bug reports 2. Suggestions for feature enhancements 3. Feedback on services and products
  How are we not involved? 1. Accepting patches (we usually rewrite or rework them)
  Comment: This is the most successful area, especially bug reports and feature requests
Decision making/future direction of the software
  How are we involved with the community? 1. We make all decisions about where the current efforts go
  How are we not involved? (none listed)
  Comment: (none listed)
Management of code, issues, development, releases
  How are we involved with the community? 1. We manage source code with SVN 2. We manage and prioritize all issues 3. We manage development 4. We decide time and features for new releases
  How are we not involved? 1. Read-only permissions; obscure, not visible 2. No access to JIRA and internal development pages such as Confluence 3. No contribution at all 4. No contribution at all
  Comment: This is a risk if the company disappears
Information sharing
  How are we involved with the community? 1. HDF-FORUM 2. We support documentation in a non-standard way (automatic generation)
  How are we not involved? 1. No defined mechanism for sharing info with community (Wiki, Web, FTP are confusing and disorganized) 2. No easy and standard way to generate documentation
  Comment: This is a long-standing issue.
Support
  How are we involved with the community? 1. HelpDesk, HDF-FORUM, Tutorials at the Workshop 2. We try to point users to software based on HDF5
  How are we not involved? 1. Users meetings, web seminars 2. User conferences 3. Acknowledgement of best contributors, best products based on HDF5
Activity / How are we involved with the community? / How are we not involved? / Comment
  Decision making/future direction of the software
    Involved: 1. We make all decisions about where the current efforts go
  Code management, issues management, development management, release management
    Involved: 1. We manage source code with SVN 2. We manage all reports 3. Done by us 4. Done by us (we define time and features for the new release)
    Not involved: 1. Read-only permissions; obscure, not visible 2. No access to JIRA and internal development pages such as Confluence 3. No contribution at all 4. No contribution at all
    Comment: This is a risk if the company disappears
  Information sharing, documentation
    Involved: 1. We support HDF-FORUM 2. We support documentation in a non-standard way (automatic generation)
    Not involved: No defined mechanism for sharing info with community (Wiki, Web, FTP are too confusing and disorganized). Providing an easy and standard way to generate documentation.
    Comment: This is a long-standing issue.
  Support
    Involved: HelpDesk, HDF-FORUM, Tutorials at the Workshop. We are trying to point users to software based on HDF5.
    Not involved: 1. Users meetings, web seminars 2. User conferences 3. Acknowledgement of best contributors, best products based on HDF5
The HDF Group: HDF project started in 1987 at NCSA/U of I (Apache and Mozilla have the same roots). The HDF Group started in 2006: non-profit in Champaign, IL; 35+ staff and $3.5M in revenue.
Our goal is not to promote OSS but to exploit OSS. We have a long way to go.
Members of the HDF support community
The HDF Group
The HDF Group Services: Helpdesk and Mailing Lists, Standard Support, Consulting, Training, Enterprise Support, Special Projects
Is HDF software really free? As in free beer? Yes, the core software is. As defined by the FSF* (run it for any purpose, study and change it, redistribute copies, distribute copies of modified versions)? Yes, but: changed versions may not create valid files; szip restriction. * Free Software Foundation
EASE OF USE (FOR TECHIES AND VISIONARIES)
WISDOM OF THE CROWD