Rensselaer Center for Open Source Software

Rob Russo

We're finalizing details for the summer: Vijay, Peter, and Josh will contribute when possible, Rob will work on the server, and Ylonka will get some DB work done. Good luck to all members of the team with their summer research and internships!

Kathryn Lovell

A large part of the Arduino Snap project is reverse-engineering Google's Blockly project. This is proving especially challenging, considering how convoluted and persnickety Google's coding structures tend to be. If we ever hope to contribute our work back to Google Blockly as a whole, we're going to have to be extremely meticulous. Because of this, I believe this project is teaching us how to work on large, collaborative, and polished projects more than anything else. Sure, it's also teaching us a fair amount about JavaScript (how about them anonymous functions?), but it's largely teaching us how to be thrown into a new codebase, a useful skill for any budding software engineer and something that can only be learned either in industry or by working on open source.

Kathryn Lovell

A large part of Google Blockly's engine runs off of a class or structure of some sort called Blockly. For instance, Python code is rendered by calling Blockly.Python.workspaceToCode(). But for the life of us, we cannot find where Blockly is defined. Believe me, I've tried. We managed to get C code rendering by running it through the structure for Dart, but way more digging needs to be done on the whole "Blockly" thing.

Mukkai Krishnamoorthy
Joseph Hitchcock

After messing around with it a decent amount, multithreading the crawler doesn't seem to be something that's going to happen this semester. I had the structure of what would need to happen logically laid out, using Python's built-in thread-safe queues and everything. The get_friends function would put a start index for each page that needed to be crawled into a queue, and then create an explicitly stated number of threads, each with access to this page queue. A calling function would be used to initiate each thread; it would continuously grab the next page off the queue and call the worker function on that page, which would do much the same thing as the single-threaded get_friends function. Each worker function would push the list of IDs it got for that page onto a shared results queue, and at the end get_friends would combine all the lists in the results queue to get the final list. A sketch of that design follows below.
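
Here's a minimal sketch of that design, kept for reference. The crawl_page function here is a stand-in for the real per-page Selenium scraping logic, and the names are mine rather than the crawler's actual code:

    import queue
    import threading

    def get_friends_threaded(start_indices, crawl_page, num_threads=4):
        # Thread-safe queues from Python's standard library.
        page_queue = queue.Queue()
        results_queue = queue.Queue()

        # Put a start index for each page that needs to be crawled.
        for index in start_indices:
            page_queue.put(index)

        def run():
            # Each thread keeps grabbing the next page off the queue
            # until the queue is drained.
            while True:
                try:
                    index = page_queue.get_nowait()
                except queue.Empty:
                    return
                # crawl_page is a placeholder for the single-page
                # scraping work; it returns a list of friend IDs.
                results_queue.put(crawl_page(index))

        threads = [threading.Thread(target=run) for _ in range(num_threads)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()

        # Combine every per-page list into the final list of IDs.
        friends = []
        while not results_queue.empty():
            friends.extend(results_queue.get())
        return friends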

However, I ran into a lot of issues with Selenium page loading that made threading very difficult to work with. In addition, the up-front cost of having multiple threads managing their own webdrivers was very noticeable: opening 4 windows, each controlled by a different driver object, happened sequentially instead of in parallel like I expected. That could just be how Selenium handles it, or I could have messed something up; I don't know. Previously, getting Paul's 1,235 friends took about 18 seconds (just under 70 IDs per second), whereas merely opening the windows needed for 4 threads to each do a quarter of the work takes about 10 seconds. For these reasons we've decided to keep using the crawler with the single-threaded method it was designed for. I've saved the multithreaded attempt as a backup so I can look back on it later and see if I can learn anything from my mistakes.
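
For what it's worth, a quick timing harness like the one below is how I'd check whether the driver windows really open one after another. This is just a sketch, assuming Selenium with the Firefox driver installed; it isn't part of the crawler itself:

    import time
    import threading
    from selenium import webdriver

    def time_driver_startup(n=4):
        # Open n browser windows from n threads and time the whole batch.
        # If startup were truly parallel, the elapsed time should be close
        # to the cost of opening a single window.
        drivers = [None] * n

        def open_driver(i):
            drivers[i] = webdriver.Firefox()

        start = time.time()
        threads = [threading.Thread(target=open_driver, args=(i,))
                   for i in range(n)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        elapsed = time.time() - start

        for d in drivers:
            d.quit()
        return elapsed  # ~10 seconds here pointed to sequential startup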

  • Joseph