Data File Access API: Under the Hood
Simon Horwith, CTO, Etrilogy Ltd.
Session Topics
- Data File Access API General Overview
  - What is it?
  - Why use it?
  - Where to get it?
- Architecture Overview
  - DFA API Functional Specification
  - Technology to the Rescue!
- Code Overview
  - DFA API Methods
  - The DFAQuery Custom Tag – Putting the “I” in “API”
- Viewing a Sample Application
- Summary
DFA API General Overview: What is it?
The Data File Access API:
- is a framework for storing, retrieving, and maintaining application data in the local file system
- is an Application Programming Interface for ColdFusion developers building small to medium sized applications on the MX platform
- is an alternative to the traditional approach of storing data in a relational database management system (RDBMS)
- allows developers to retrieve data as a ColdFusion Query or as XML using either SQL or XPath!
DFA API General Overview: Why use it?
The DFA API offers many benefits over traditional RDBMS solutions:
- platform neutrality
- portability
- performance
- XPath and XSL support
XML storage and retrieval allows for:
- data and data-definition re-purposing
- Web Services
- Flash Remoting
- better flexibility and extensibility
DFA API General Overview: Where to Get It?
The DFA API is available to Macromedia DevNet subscribers. It is part of DRK 3.
Installing the DFA API from the DRK CD will install:
- ColdFusion Component API
- Custom Tag “wrapper”
- Sample Application
- Sample DFA API XML “skeleton” files
An additional application for data entry and data table creation is available at
DFA API Architecture Overview: Functional Specification
DFA API Goals:
- Create an Application Programming Interface to allow developers to query data stored as XML or CSV using familiar syntax (SQL), as well as perform other common SQL tasks such as adding, manipulating, and deleting data
- Small to medium sized application usage is recommended
- Performance must be comparable to that of applications that query an RDBMS
- Applications using the API should be able to be migrated from one machine to another with ease
- Cross-platform functionality
- Must be flexible and easy to extend
DFA API Architecture Overview: Technology to the Rescue!
All API functionality is housed in a single ColdFusion Component:
- Easy installation (just copy the cfc to the server!)
- Easily executed by any application on the server
- Public/private data and functionality
- API can easily be extended via component inheritance
- Developers could expose API functionality and data via Web Services
- API could be used to provide data to Flash applications via Flash Remoting
DFA API Architecture Overview: Technology to the Rescue! (cont’d)
XML is useful in the real world…
Architecture uses XML to:
- Define data tables (table names, columns, etc.)
- Map location of data files to the data table definitions
- Define the actual data
Each of the above (data table definitions, locations, and data) is stored in an XML file.
Data table definitions and mappings are loaded and stored in memory, each as an XML DOM. Data table data is loaded into memory (and persisted) the first time the data is needed.
CSV support simply means converting all CSV content to XML.
Storing this data as XML enables:
- Retrieval of data as XML
- Use of XPath to retrieve data
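The CSV-to-XML idea on this slide can be illustrated with a minimal sketch. It is written in Python rather than CFML and is not the DFA API's own code: the element names (datatable, row), the function csv_to_xml, and the sample data are all assumptions made for illustration. It assumes the CSV column names are valid XML element names.

```python
import csv
import io
import xml.etree.ElementTree as ET


def csv_to_xml(csv_text, table_name):
    """Convert CSV text (first row = column names) into an XML data table.

    The <datatable>/<row> layout is hypothetical, not the DFA API's format.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    table = ET.Element("datatable", {"name": table_name})
    for record in reader:
        row = ET.SubElement(table, "row")
        for column, value in record.items():
            # Each CSV column becomes a child element of the row.
            ET.SubElement(row, column).text = value
    return table


CSV_TEXT = "id,name,city\n1,Ann,London\n2,Bob,Paris\n3,Cy,London\n"
members = csv_to_xml(CSV_TEXT, "members")

# Once the data is XML, it can be filtered with an XPath expression.
# ElementTree supports only a subset of XPath, which is enough here.
londoners = [
    row.findtext("name")
    for row in members.findall("./row[city='London']")
]
print(londoners)
```

Because the CSV content ends up in the same XML shape as any other data table, the same XPath retrieval path serves both storage formats.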
DFA API Architecture Overview: Technology to the Rescue! (cont’d)
Query of Queries saves the day!
- The API must support not only XPath but also SQL SELECT statements for data retrieval.
- Thanks to QoQ, adding support for SQL SELECT was easily accomplished by first converting the XML DOM in question to a ColdFusion Query variable and then executing the SQL against it in a Query of Queries.
- If the data is to be returned as XML, the result set is converted back to an XML DOM before being returned.
- Performance is fantastic, performance is fantastic, performance is fantastic!!
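The XML-to-query-to-XML round trip described on this slide can be sketched as follows. This is an analogy in Python, not the API's implementation: an in-memory SQLite database stands in for ColdFusion's Query of Queries, and the XML layout, function names, and sample data are assumptions.

```python
import sqlite3
import xml.etree.ElementTree as ET

XML_TEXT = """<datatable name="members">
  <row><id>1</id><name>Ann</name><city>London</city></row>
  <row><id>2</id><name>Bob</name><city>Paris</city></row>
  <row><id>3</id><name>Cy</name><city>London</city></row>
</datatable>"""


def xml_to_rows(table):
    """Flatten a data table DOM into (column names, list of row tuples)."""
    rows = table.findall("row")
    columns = [child.tag for child in rows[0]] if rows else []
    return columns, [tuple(row.findtext(c) for c in columns) for row in rows]


def select_from_xml(table, sql):
    """Run a SELECT against a non-empty data table and return XML."""
    columns, rows = xml_to_rows(table)
    name = table.get("name")
    conn = sqlite3.connect(":memory:")  # stand-in for Query of Queries
    try:
        col_defs = ", ".join(f'"{c}"' for c in columns)
        conn.execute(f'CREATE TABLE "{name}" ({col_defs})')
        marks = ", ".join("?" for _ in columns)
        conn.executemany(f'INSERT INTO "{name}" VALUES ({marks})', rows)
        cursor = conn.execute(sql)
        result_columns = [d[0] for d in cursor.description]
        # Convert the result set back to an XML DOM before returning it.
        result = ET.Element("datatable", {"name": name})
        for record in cursor:
            row = ET.SubElement(result, "row")
            for column, value in zip(result_columns, record):
                ET.SubElement(row, column).text = str(value)
        return result
    finally:
        conn.close()


table = ET.fromstring(XML_TEXT)
result = select_from_xml(
    table, "SELECT name FROM members WHERE city = 'London' ORDER BY name"
)
names = [row.findtext("name") for row in result.findall("row")]
print(names)
```

The point of the design is that the SQL engine never needs to understand XML: the DOM is flattened to rows on the way in and rebuilt only when the caller asks for XML back.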
DFA API Architecture Overview: Technology to the Rescue! (cont’d)
Locking, Memory, and File System Access
- Misinformation dispelled: file I/O operations are actually very fast
- All required information and data operations are stored/performed first in persistent memory in order to boost performance
- Named locks are used for all data operations:
  - Exclusive locks for all write operations to data tables in memory and on file
  - Read-only locks for all other data operations
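The locking and memory strategy above can be sketched in Python as one named readers-writer lock per data table, with tables loaded into memory on first use. This is an illustration under assumptions, not the component's code: NamedRWLock, TableStore, and the loader/saver callbacks are hypothetical names, and a plain dict stands in for the file system.

```python
import threading
from contextlib import contextmanager


class NamedRWLock:
    """Minimal readers-writer lock: many readers or one writer at a time."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writing = False

    @contextmanager
    def read(self):
        with self._cond:
            while self._writing:
                self._cond.wait()
            self._readers += 1
        try:
            yield
        finally:
            with self._cond:
                self._readers -= 1
                self._cond.notify_all()

    @contextmanager
    def write(self):
        with self._cond:
            while self._writing or self._readers:
                self._cond.wait()
            self._writing = True
        try:
            yield
        finally:
            with self._cond:
                self._writing = False
                self._cond.notify_all()


class TableStore:
    """Keeps data tables in memory, guarded by one lock per table name."""

    def __init__(self, loader, saver):
        self._loader, self._saver = loader, saver
        self._tables, self._locks = {}, {}
        self._registry = threading.Lock()

    def _lock_for(self, name):
        with self._registry:
            if name not in self._locks:
                self._locks[name] = NamedRWLock()
            return self._locks[name]

    def get_rows(self, name):
        lock = self._lock_for(name)
        with lock.read():  # read-only lock for retrieval
            if name in self._tables:
                return list(self._tables[name])
        with lock.write():  # first use: load the table exclusively
            if name not in self._tables:
                self._tables[name] = self._loader(name)
            return list(self._tables[name])

    def add_row(self, name, row):
        with self._lock_for(name).write():  # exclusive lock for writes
            if name not in self._tables:
                self._tables[name] = self._loader(name)
            self._tables[name].append(row)  # update memory first...
            self._saver(name, self._tables[name])  # ...then persist


# A dict plays the role of the data files on disk.
disk = {"members": [{"id": 1, "name": "Ann"}]}
load_calls = []


def load(name):
    load_calls.append(name)
    return list(disk[name])


def save(name, rows):
    disk[name] = list(rows)


store = TableStore(load, save)
store.get_rows("members")
store.add_row("members", {"id": 2, "name": "Bob"})
rows = store.get_rows("members")
print(len(rows), load_calls)
```

In this sketch the table is read from "disk" only once; every later read is served from memory under a shared lock, which is the behaviour the slide credits for the performance.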
Code Review – Component Methods
Method: Description
AddData: Adds a row of data to a datatable
addDataTableMapping: Adds a datatable mapping to the mapping file
addMasterDataTableDefinition: Adds a datatable definition to the master definition file
commitDataTable: Saves a datatable currently in memory as an XML file
DeleteData: Deletes rows of data from a datatable
DumpIt: <cfdump>s a value then does a <cfabort>. Only used for debugging and should be commented out in production.
parseDataDefinitionMappings: Loads the 'datatable'-to-'data-file' mappings for an application
parseDataTableDefinitions: Parses the XML file containing all available data table definitions and creates an XML DOM of all data tables
GetData: Loads a datatable if it doesn’t already exist, performs SQL or XPath searches, returns data as XML or a query (or as transformed XML).
getDataTableDefinitionArray: Returns an array of structures that define a datatable
GetDef: Returns a DOM of all available table definitions. Only used for debugging and should be commented out in production.
Code Review – Component Methods
Method: Description
GetMap: Returns the DOM of application-specific table definition to directory/URL mappings. Only used for debugging and should be commented out in production.
GetNextID: Returns the next ID for a datatable when performing an insert
GetPubData: Returns a DOM of all loaded datatables. Only used for debugging and should be commented out in production.
GetTableLocation: Retrieves a datatable location from the application data definition XML mapping DOM
IsCSV: Verifies whether or not a string is in proper CSV format
isDataTableIDUnique: Validates that a data table ID is not currently in use
isDataTableInMemory: Is a specific datatable in memory or able to be loaded into memory?
Code Review – Component Methods
Method: Description
LoadCSV: Creates a datatable from CSV text or from a CSV file
PathTypeOf: Determines whether a string is an absolute path, a relative path, or a URL
removeDataTable: Removes a datatable from memory
UpdateData: Modifies data in an existing datatable
XmlToRS: Converts a datatable XML DOM to a CFML query
The DFAQuery Custom Tag – Putting the “I” in “API”
One goal of the DFA API was to make the API as easy as possible for developers to use in their applications. This was achieved by creating a custom tag “wrapper” to shield developers from the component and to allow them to use the tag syntax they are already familiar with.
The DFAQuery Custom Tag – Putting the “I” in “API”
DFAQuery Tag Execution Modes
Mode: Tag Operation
Start: Tag checks to see if there’s an active instance of the API. If so, it does nothing. If not, it checks to see if the API can be initialized and either creates an instance or throws an error.
End: Tag performs the requested API operation.
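The two execution modes resemble a wrapper that makes sure a shared instance exists on entry and does the real work on exit. A rough Python analogy uses a context manager; this is not how the custom tag is written, and DataFileAPI, DFAQuery, the shared scope dict, and the "dfa_api" key are all hypothetical names chosen for the sketch.

```python
class DataFileAPI:
    """Stand-in for the API component; holds sample tables in memory."""

    def __init__(self):
        self.tables = {"members": [{"id": 1, "name": "Ann"}]}

    def get_data(self, table):
        return list(self.tables[table])


class DFAQuery:
    """Wrapper whose enter/exit steps mirror the start and end tag modes."""

    def __init__(self, table, scope, factory=DataFileAPI):
        self.table = table
        self.scope = scope      # plays the role of a persistent scope
        self.factory = factory  # how to initialize the API if needed
        self.result = None

    def __enter__(self):
        # "Start" mode: do nothing if an instance is already active;
        # otherwise try to create one, and raise if that is not possible.
        if "dfa_api" not in self.scope:
            try:
                self.scope["dfa_api"] = self.factory()
            except Exception as exc:
                raise RuntimeError("API could not be initialized") from exc
        return self

    def __exit__(self, exc_type, exc, tb):
        # "End" mode: perform the requested operation.
        if exc_type is None:
            self.result = self.scope["dfa_api"].get_data(self.table)
        return False


shared_scope = {}

with DFAQuery("members", shared_scope) as first:
    pass
with DFAQuery("members", shared_scope) as second:
    pass

print(first.result)
```

Both wrappers end up using the same API instance, because the second one finds it already present in the shared scope and skips initialization.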
Viewing a Sample Application
The rolodex application is a simple sample application that installs with the API. It uses the DFAQuery custom tag to interact with the API component in order to allow users to query the application for rolodex members, view member details, and update member information.
A second sample application (qbe.cfm) is also installed with the API. It is a query-by-example interface that shows how CSV text can be parsed into a data table, queried using SQL or XPath, and deleted on the fly.
Let’s briefly examine the underlying code and files used by these applications.
A third sample application, available separately, uses nothing but direct component method calls to create an administrative interface for data entry and data table creation!
Summary
The Data File Access API:
- Serves as an example of how to “properly” architect a component-based application, as well as how to code an API that makes use of the best features of ColdFusion MX
- Offers developers an alternative to the traditional approach to storing and manipulating data in dynamic web applications. This alternative is easily extended to meet application-specific needs, is not dependent on any RDBMS platform, and allows the data tier and its business logic to be distributed as part of the application – all without sacrificing performance.