Email Merge: Automating Financial Aid Award Letters
Stirling Crow, IT Services Specialist
Kim Luu, Financial Aid Technical Analyst
Jagruth Peddineni, IT Student Technical Specialist
June 9th, 2016
Agenda: Requirements, Demo, Issues and Troubleshooting, Solution, Questions?
Mail Merge:
– A process made popular by Microsoft Word that allows a template to be populated with data from a list.
– Quick Demo
Terminology
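The idea of a mail merge can be sketched in a few lines of Python. This is only an illustration of the concept, not the application described in these slides: the `<<FIELD>>` placeholder style, the field names, and the sample rows are all assumptions made for the example.

```python
import csv
import io
import re

# Hypothetical template: each <<FIELD>> names a column in the list.
TEMPLATE = "Dear <<FNAME>> <<LNAME>>, your award for <<AID_YEAR>> is <<AMOUNT>>."

# Hypothetical list; in practice this would be a .csv file on disk.
CSV_TEXT = """FNAME,LNAME,AID_YEAR,AMOUNT
Ada,Lovelace,2016-2017,1500
Alan,Turing,2016-2017,2000
"""

def merge(template, csv_text):
    """Return one filled-in copy of the template per row of the list."""
    letters = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Swap every <<FIELD>> for the value in the matching column.
        filled = re.sub(r"<<(\w+)>>", lambda m: row[m.group(1)], template)
        letters.append(filled)
    return letters

letters = merge(TEMPLATE, CSV_TEXT)
for letter in letters:
    print(letter)
```

A placeholder with no matching column raises a `KeyError` here, which is one simple way to catch a template and a list that have drifted apart.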
Each year the Financial Aid department is responsible for sending out emails/letters for the following:
- Athletic Award Renewals
- Athletic Award Reductions
- Athletic Award Cancellations
Fin Aid prefers to send customized emails with a customized PDF (that cannot be edited) that provides specific information to the recipient.
4 Requirements
Kim uses the following process each year:
- Generate a list with items for the merge (Student ID, Name, Address, Award Info, etc.).
- Using the list and a letter template, create a very large Mail Merge document (with the merged letters).
- Copy the student’s particular letter out of the Mail Merge document into a blank Word document and save.
- Convert the Word document to a PDF.
- Create a customized email with student info.
- Attach the customized PDF to the customized email.
- Send the email with the PDF.
5 Requirements
Fin Aid needs a way to do the following:
- Send emails with a large number of customized fields.
- Send emails with a large number of customized fields where the email also needs an attached PDF letter that has a number of customized fields.
- Update information in Banner (RUAMAIL) so that there is a record of the student being sent the email and/or the attachment.
- Ideally, document exactly what was sent to each student.
6 Requirements
Available options: Banner Delivered ROREMAL job.
Advantages:
- You can send emails to a large number of students.
- Each email sent is logged via the RUAMAIL form.
Limitations:
- The ROREMAL job only has a limited number of customizable fields. You can’t create a customized email with more than 10 fields.
- You can’t send attachments.
8 Requirements
Available options: MS Email Merge: Link
Advantage:
- You can send customized emails.
Limitations:
- There’s no clear way to add customized documents to the customized email.
- There’s no way to update RUAMAIL for logging purposes.
9 Requirements
Goals:
- It should be a web application.
- It should do the following:
  - Send customized emails.
  - Send customized PDFs.
  - Update Banner.
  - Create copies of everything it sends out and store the contents somewhere accessible.
10 Application Development
The user needs to be able to:
- Upload an email template.
- Upload a Word template.
- Upload a .csv list.
- Provide an email subject.
- Provide the name for the attached PDF.
- Provide the code and aid year for the RUAMAIL logging.
11 Application Development
The application must:
- Allow for uploading documents.
- Send emails.
- Parse Word documents.
- Create PDFs from Word documents.
- Connect to and update a database.
12 Application Development
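To make the "send an email with an attached PDF" requirement concrete, here is a minimal sketch using Python's standard `email` package. It is a stand-in for illustration only and not how the application was built. The addresses, subject, file name, and placeholder PDF bytes are all hypothetical, and the message is only constructed, not sent.

```python
from email.message import EmailMessage

def build_award_email(sender, recipient, subject, body, pdf_bytes, pdf_name):
    """Build (but do not send) an email carrying one PDF attachment."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    # Attach the generated letter under the name the user provided.
    msg.add_attachment(
        pdf_bytes,
        maintype="application",
        subtype="pdf",
        filename=pdf_name,
    )
    return msg

# Hypothetical values; the bytes below are a placeholder, not a valid PDF.
message = build_award_email(
    sender="finaid@example.edu",
    recipient="student@example.edu",
    subject="Your Athletic Award Renewal",
    body="Dear Ada, your award letter is attached.",
    pdf_bytes=b"%PDF-1.4 placeholder",
    pdf_name="award_letter.pdf",
)
attachments = list(message.iter_attachments())
print(message["Subject"], "->", attachments[0].get_filename())
```

Actually delivering the message would need an SMTP server (for example via `smtplib`), which is left out of this sketch.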
Development Options:
- PL/SQL: This will not work. The libraries for parsing Word documents and PDFs do not exist.
- Grails:
  - Banner XE Self-Service applications are built and deployed with the same technology.
  - Java-based. Any open-source Java library (API) can be used.
  - Stirling had experience building a demo application a few years ago.
13 Application Development
Technologies Used:
- Grails 2.5
- WebLogic (for deployment)
- Spring – uploading files
- Spring Security – authentication and authorization
- Apache Commons Email – sending emails
- Apache POI – parses Word files and creates PDFs
14 Application Development
Initial Demo: 15 Application Development
Kim began to test using a real Financial Aid Letter. Demo To Do: Get documents to show demo. 17 Issues and Troubleshooting
Kim’s Test - Word Document: 18 Issues and Troubleshooting
Kim’s Test - CSV File: 19 Issues and Troubleshooting
Some of the Mail Merge Fields are not working! 20 Issues and Troubleshooting
Why aren’t the Mail Merge Fields working??! Stirling tried to update the code to use different delimiters instead of << >>. That did not work. The problem isn’t the delimiters. 22 Issues and Troubleshooting
Jagruth began to debug. He discovered that the parser that brings in the XML and the information in the Word Document was splitting up the Variable Names! 23 Issues and Troubleshooting
FNAME is being parsed correctly: 24 Issues and Troubleshooting
So the program can appropriately do a find and replace: 25 Issues and Troubleshooting
LNAME is not being parsed! 26 Issues and Troubleshooting
FA_MAIL_DATE is also not being parsed correctly! 27 Issues and Troubleshooting
As a result the application is not able to do a find and replace on those fields (along with the others)! 28 Issues and Troubleshooting
Troubleshooting – Creating a Simpler Document: The parser must be having a hard time with the formatting. The XML must be too much for it to handle. What if we could make the XML a bit simpler for the parser? Stirling copied the text from the Word document into Notepad++ to strip away any formatting and then copied the text from Notepad++ into a new, clean Word document. 29 Issues and Troubleshooting
Troubleshooting – Creating a Simpler Document: 30 Issues and Troubleshooting
Troubleshooting – Creating a Simpler Document: Stirling tested and it worked. 33 Issues and Troubleshooting
Troubleshooting – Creating a Simpler Document: Kim began to add simple formatting with spaces and fonts and it began to break. 34 Issues and Troubleshooting
Troubleshooting – It’s NOT the Parser!! Jagruth began to dig deeper. Why was the parser messing up the XML? 35 Issues and Troubleshooting
Troubleshooting – It’s NOT the Parser!! We did some research on .docx files (Word files since 2007). We learned that these files are actually zip files. You can use an unzip utility to unzip these files. And when you do… Demo 03-UnzipWordDocDemo\CancelTestVersion2\word\document.xml 36 Issues and Troubleshooting
Troubleshooting – It’s NOT the Parser!! …you see that the XML is formatted strangely within the Word document itself! The parser is doing what it’s supposed to! The problem is with how MS Word saves its data. 37 Issues and Troubleshooting
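The unzip step and the split variable names can be reproduced with a short Python sketch. Everything in it is an assumption made for illustration: the XML below is a hand-written, heavily simplified `word/document.xml`, the `<<NAME>>` delimiters are hypothetical, and the archive built here contains only that one part, so Word would not open it. Where exactly Word split the names in Kim's letter is not shown here. The sketch only demonstrates the general effect of one placeholder being stored across several runs (`w:r` / `w:t` elements).

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/document.xml.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Hypothetical, simplified document.xml: <<FNAME>> sits in one run,
# while <<LNAME>> is split across two runs.
DOCUMENT_XML = (
    f'<w:document xmlns:w="{W}"><w:body><w:p>'
    "<w:r><w:t>Dear &lt;&lt;FNAME&gt;&gt; </w:t></w:r>"
    "<w:r><w:t>&lt;&lt;L</w:t></w:r>"
    "<w:r><w:t>NAME&gt;&gt;</w:t></w:r>"
    "</w:p></w:body></w:document>"
)

def build_minimal_docx(document_xml):
    """Zip up just word/document.xml (NOT a complete .docx file)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("word/document.xml", document_xml)
    return buffer.getvalue()

def read_run_texts(docx_bytes):
    """Unzip the archive and return the text of each w:t element in order."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        xml_bytes = archive.read("word/document.xml")
    root = ET.fromstring(xml_bytes)
    return [t.text or "" for t in root.iter(f"{{{W}}}t")]

runs = read_run_texts(build_minimal_docx(DOCUMENT_XML))
found_in_single_run = {
    name: any(f"<<{name}>>" in run for run in runs)
    for name in ("FNAME", "LNAME")
}
print(runs)
print(found_in_single_run)
```

A find and replace that looks at one run at a time can match `<<FNAME>>` but never sees `<<LNAME>>`, even though the joined text of the paragraph contains it. For a real file, `zipfile.ZipFile("letter.docx")` (file name hypothetical) opens the document the same way.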
What do we do now? How can we replace fields if the XML in the original Word Document splits up our variable names? 38 Issues and Troubleshooting
Options and Outcomes:
- Pre-2007 Word documents (.doc). Can we parse these? Jagruth attempted to do this using the available APIs. We saved the letter as a .doc and modified our code, but it didn’t work. He got errors.
- Parsing the .docx file to an HTML file? docx → HTML → docx → PDF? Two problems:
  - The find and replace didn’t work (the weird format of the XML came over into the HTML).
  - We lost page formatting with the HTML.
39 Issues and Troubleshooting
Options and Outcomes: Find and Replace within PDF? If we’re having so many issues with docx files, perhaps we can try to do a find and replace within a PDF file that we create? PDF files can’t be as difficult as Word files, right?
docx → PDF → find and replace within PDF → save PDF
Jagruth looked into the following APIs:
- Aspose – Using this API costs money.
- iText – Not free for our purposes (AGPL or commercial license). This may have the functionality, but you would probably have to buy parts of it.
- PDFBox – This is an open-source project supported by Apache! Surely this will have find and replace functionality!!!
40 Issues and Troubleshooting
Options and Outcomes: Find and Replace within PDF? The following is from the PDFBox site: 41 Issues and Troubleshooting
Options and Outcomes: Find and Replace within PDF? Jagruth tried an earlier version of PDFBox and used its find and replace functionality. It did not work. 42 Issues and Troubleshooting
Options and Outcomes: We can’t do “find and replaces” in PDFs. Word splits out our variables so we can’t do consistent find and replacing. What can we do? 43 Issues and Troubleshooting
Single Character Approach: Well, you can’t split a single character in half. What if you used a rare character (like an ampersand “&”) as an indicator for the application to replace a value? Not ideal – but… let’s give it a try. We updated the code that “finds and replaces” in the Word Document to go through the values in the csv file and replace them in order of the ampersand values found in the Word document. Demo 44 Issues and Troubleshooting
Single Character Approach: Word Document: 45 Issues and Troubleshooting
Single Character Approach: CSV file (fields must be in the order of the ampersands): 46 Issues and Troubleshooting
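The single character approach can be sketched as follows. This is a simplified illustration that works on a plain string rather than on the Word document's XML, and it is not the application's actual code. The template text, the column names, the sample rows, and the presence of a header row in the CSV are all assumptions.

```python
import csv
import io

MARKER = "&"  # the rare character that marks each replaceable spot

def fill_markers(template_text, values):
    """Replace the n-th marker in template_text with the n-th value."""
    parts = template_text.split(MARKER)
    marker_count = len(parts) - 1
    if marker_count != len(values):
        raise ValueError(
            f"template has {marker_count} markers but row has {len(values)} values"
        )
    pieces = [parts[0]]
    for value, following_text in zip(values, parts[1:]):
        pieces.append(value)
        pieces.append(following_text)
    return "".join(pieces)

# Hypothetical template and list; columns are in the order of the markers.
TEMPLATE = "Dear & &, your award letter was mailed on &."
CSV_TEXT = (
    "FNAME,LNAME,FA_MAIL_DATE\n"
    "Ada,Lovelace,06/09/2016\n"
    "Alan,Turing,06/10/2016\n"
)

rows = list(csv.reader(io.StringIO(CSV_TEXT)))
header, data_rows = rows[0], rows[1:]  # assumes the first row is a header
letters = [fill_markers(TEMPLATE, row) for row in data_rows]
for letter in letters:
    print(letter)
```

Because the template is split on the marker before any values are inserted, a value that itself contains an ampersand does not shift the later replacements. The count check makes a mismatch between markers and CSV columns fail loudly instead of producing a letter with fields in the wrong places. The trade-off is that the template can no longer use the marker character as ordinary text.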
Single Character Approach: Demo 47 Issues and Troubleshooting
Single Character Approach: Results – it works! 48 Issues and Troubleshooting
Athletic award letters will be sent next week! Other Fin Aid letters will present challenges since they contain tables and other items that may not be formatted correctly. Testing has to happen. Possible fix using .doc files that still needs to be tried. 49 What’s Next
50 Questions? Stirling Crow, Kim Luu, Jagruth Peddineni