Microsoft AppHack

Yesterday, 11/9/2013, was the Microsoft Hackathon. Since it took place just one week after the Intuit hackathon, I can easily compare and contrast my experiences at the two. I'll start with the nontechnical differences (organizational, etc.), and then move on to the differences in my projects. I did not present at the Microsoft Hackathon.

Nontechnical:

Food-wise, the Microsoft Hackathon was slightly better. Intuit ran out of dinner, while Microsoft managed to keep dinner going reasonably well, even with certain individuals taking a lot of food back to their tables (thus causing a resource-allocation problem!).

Theme-wise, the Intuit Hackathon did a better job of communicating what they wanted. The theme was ‘As a student, I wish I could…’, which was fairly restrictive. For the Microsoft Hackathon, the only direction I felt we got was to make a Windows app. This is merely an observation of fact; there are benefits to both approaches. The first allows for a more direct competition (comparing apples to apples) and a comparison of different approaches to solving similar problems. The second allows full creative license, and the opportunity to see some truly outrageous things.

As a side note, Microsoft's restriction of teams to one or two people really hurt this hackathon in my opinion. People at UCSD are very nice and willing to offer advice, but larger teams (3-4 people) let you combine people of different backgrounds (perhaps one person you know and one you don't), which makes for more interesting groups. The restriction to Windows platform hacks makes sense for a Microsoft AppHack, but it was a real problem early on, when even the provided Samsung tablets had issues and the staff seemed rather confused about the whole scenario. Even though I think Microsoft had more technology problems overall, it was nice to see them handle those problems well; I was impressed on both counts.

Intuit Hackathon

This past weekend (11/2) I participated in the Hacking, Are You Intuit? hackathon at UCSD. I was part of a three-person team (myself, Vivek Iyer, and Abhijith Chitlur). We created an application for the Android platform named BookBroker (Abhijith suggested the name).

The Application:

BookBroker is an application born from the idea that a student should only have to press a button to sell their textbook. I had observed that students tend to sell books through Facebook groups (such as UCSD’s Free and For Sale), a phenomenon that can also be seen at other universities. Selling books on Facebook has its advantages: money changes hands quickly between students at the same university, and there is no need to deal with hassles like shipping. Building on Facebook also removes the need for a separate login system, so the only pain the user feels is having to download an Android application.

The first screen of the application is a barcode scanner, which lets a student grab one of their textbooks and scan it. The second screen shows the augmented data for the book scanned on the first screen, and includes a button that lets the user post it to a Facebook group.

The Technology:

The book-scanning feature on the first screen is powered by the Scandit API, a scanner API that reads a barcode and returns an ISBN. Vivek was responsible for this part, so I only have a cursory understanding of what happened here.

The second screen’s Facebook connect feature is powered by Facebook’s Open Graph API, which brings with it the problem of authenticating with tokens. This is handled through the Facebook API and by remembering to wrap all outgoing requests in an AsyncTask. Android does not allow blocking network calls on its UI thread; without the AsyncTask, the GET request would sit waiting for a response on the UI thread and the OS would throw an exception. Careful use of the debugger was necessary to catch this error at around 3 AM. After that, the GET request to augment the data was fairly simple: the server responds with a string of information that can be parsed as JSON using the Jackson APIs, and from there the data could be accessed easily.

I worked on the data augmentation section, where the application queries a server with the ISBN and receives a set of information back. This was implemented with the Express framework (node.js), chosen because it is quick to code in and our use case was simple enough to get started with right away. Express was used primarily as a router: the Android application hits a specific route with the ISBN as a parameter, and Express is responsible for formulating the correct queries against the data store. The data store in this case was Apache SOLR, chosen primarily for its out-of-the-box HTTP REST interface and minimal configuration requirements. It probably isn’t the most efficient data store, but it provides an excellent catch-all with rich Lucene querying capability without writing a SQL query, which makes it easy to explain and relatively easy to use (a query is a simple GET request). That makes it an ideal data store for a hackathon: all of these features are open source, come straight out of the box, and reduce the coding challenge to providing a middle server that makes and serves HTTP requests (i.e. no dealing with an ORM).
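As a rough sketch of how thin this routing layer can be, something like the following would do the job. The SOLR core name (“books”), the field name (“isbn”), and the ports are illustrative assumptions, not our exact hackathon setup:

```javascript
// Minimal sketch of the Express routing layer: take an ISBN from the route,
// forward it to SOLR's out-of-the-box HTTP query interface, and return the
// first matching document as JSON.
var express = require('express');
var http = require('http');

var app = express();

app.get('/book/:isbn', function (req, res) {
  // Assumes a local SOLR core named "books" with an "isbn" field.
  var solrUrl = 'http://localhost:8983/solr/books/select?wt=json&q=isbn:' +
                encodeURIComponent(req.params.isbn);

  http.get(solrUrl, function (solrRes) {
    var body = '';
    solrRes.on('data', function (chunk) { body += chunk; });
    solrRes.on('end', function () {
      // SOLR's JSON response puts matches under response.docs.
      var docs = JSON.parse(body).response.docs;
      res.json(docs.length ? docs[0] : {});
    });
  }).on('error', function (err) {
    res.status(500).send(err.message);
  });
});

app.listen(3000);
```

The Android client only ever sees the `/book/:isbn` route, which keeps the SOLR details (and any later change of data store) hidden behind the middle server.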

The scraper that fed data into SOLR was also implemented in node.js using the cheerio module, which allows for quick and easy parsing of HTML pages. Cheerio offers a CSS-selector style of parsing, in the same family as JSoup and BeautifulSoup. The scraping itself was relatively trivial; the main trouble was finding a site endpoint that allowed searches by ISBN. Since that is not a common use case, I had to get a little creative: I first scraped the UCSD site for section IDs, then used those IDs to hit the bookstore website for ISBNs. In this manner, the website data was made available for scraping. Since node.js is asynchronous with callbacks by nature, the quick, hacky solution was to commit changes after every scrape (to make sure the data is available), but this causes a very significant slowdown. As we only had ~4k documents to index, this was acceptable, but if the application were scaled further, a different solution would be needed. An alternative is to add documents after each scrape and then send a single commit to the server at the end. Both methods end with the same result: a properly indexed SOLR instance waiting for connections.
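To make the batched-commit alternative concrete, here is a minimal sketch of that flow. The page URL, the CSS selectors, and the “books” core name are placeholders, since the real UCSD and bookstore endpoints and markup are not reproduced here:

```javascript
// Sketch of the scraper: parse book rows out of a page with cheerio,
// accumulate documents in memory, then POST the whole batch to SOLR
// with a single commit at the end (instead of committing per scrape).
var http = require('http');
var cheerio = require('cheerio');

var docs = []; // documents accumulated across all scraped pages

// Fetch a page and hand its HTML to a callback (error handling elided).
function fetchPage(url, cb) {
  http.get(url, function (res) {
    var html = '';
    res.on('data', function (chunk) { html += chunk; });
    res.on('end', function () { cb(html); });
  });
}

// Scrape one bookstore page; '.course-row', '.isbn', '.title' stand in
// for whatever the real site's markup looked like.
function scrapePage(url, done) {
  fetchPage(url, function (html) {
    var $ = cheerio.load(html);
    $('.course-row').each(function () {
      docs.push({
        isbn:  $(this).find('.isbn').text().trim(),
        title: $(this).find('.title').text().trim()
      });
    });
    done();
  });
}

// Once every page has been scraped, send the batch and commit once.
function indexAll() {
  var req = http.request({
    host: 'localhost',
    port: 8983,
    path: '/solr/books/update?commit=true',
    method: 'POST',
    headers: { 'Content-Type': 'application/json' }
  }, function (res) {
    console.log('SOLR responded with status', res.statusCode);
  });
  req.write(JSON.stringify(docs));
  req.end();
}
```

Committing once at the end avoids making SOLR reopen its searcher after every page, which is where the per-scrape-commit slowdown comes from.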