An Alchemy News API Library for Node.js

A few months ago, I worked on a project that required some natural language analysis of recent news articles from around the world. I decided to achieve this using a third-party natural language processing API. This led me to AlchemyAPI.

AlchemyAPI is an IBM company that provides twelve different semantic text analysis APIs that leverage state-of-the-art natural language processing techniques. One of these APIs is AlchemyData News, which supports targeted semantic search and analysis of recent news articles and blogs from around the world. For instance, using AlchemyData News, one can search for all positive news articles about Google published within the last 30 days.

AlchemyData News was exactly what I needed for my project. Since I was working with AngularJS and Node, my first instinct was to find a Node.js library for AlchemyData News. I felt this was necessary because AlchemyData News provides an extensive query language that I considered overkill for basic targeted search queries. I also found the query language quite verbose, although I could see why, considering that such a service must support complex and flexible queries. Unfortunately, at the time, I was only able to find a Node.js library for the more mature and popular AlchemyLanguage API, which provides general text analysis services. I eventually resorted to making raw queries to AlchemyData News using the provided query language. However, in that moment, I decided that I would eventually create a simple Node.js library for AlchemyData News. My intention was to make it as similar as possible to the excellent Node.js library I had found for AlchemyLanguage.

I have recently completed a basic implementation of a Node.js library that simplifies some of the basic queries supported by the AlchemyData News API. Many additions and improvements can still be made to the library, especially with regard to supporting some of the more flexible querying capabilities provided by AlchemyAPI. However, this basic version provides the ability to retrieve:

  • categorized news content by searching for news on topics you care about e.g. baseball, mobile phones, etc.
  • news content containing abstract concepts e.g. democracy, polygamy, etc.
  • news content containing specific keywords where keywords are terms explicitly mentioned in the article that are determined to be highly relevant to the subject matter of the news article.
  • news articles containing named entities e.g. proper nouns such as people, cities, companies, products, etc.
  • news articles based on positive, negative or neutral sentiment.

The primary purpose of this Node.js library is to simplify the construction of some simple queries supported by AlchemyData News. It works for some basic queries but is far from being a complete mirror of the capabilities provided by AlchemyData News. I encourage everyone who finds this useful to make changes, suggestions and contributions as necessary.

Documentation and source can be found at https://github.com/davidadamojr/alchemy-news-api.

A Diary of Solutions to Common Coding Interview Questions and Programming Puzzles

Over the past few months, I have tried to form a long-term habit of reading about and writing solutions to programming puzzles and common coding interview questions. In the process, I have implemented quite a number of solutions to many programming puzzles and compiled them into some sort of constantly updated “diary”. This “diary” is maintained and updated as a git repository on Github.

At the time of this blog post, the repository contains 60 Python solutions to about 55 programming puzzles and common coding interview questions. Find the repository here: https://github.com/davidadamojr/diary_of_programming_puzzles.

Hacking the Tweet Stream with Cliptext

Sharing images that contain significant amounts of text is something quite a number of Twitter users already do. It also seems like a pretty convenient way to “hack the tweet stream” and avoid the 140-character limit.

Sharing excerpts from online articles is something I do quite often on Twitter. These excerpts often exceed Twitter’s 140-character limit, so copying and pasting these pieces of text directly into Twitter’s status message box does not always work out as planned. When faced with this 140-character dilemma, one option is to copy these pieces of text into storming.me and share the generated image. But being the “lazy” person that I am, I find that process a little tedious. Another option is to create a Marc Andreessen-style tweetstorm using an app like tweetstorm.io or WriteRack. Unfortunately, this option does not seem particularly suited to sharing a text excerpt, especially since one might want to make a short personal comment about the excerpt being shared. Besides, there is that dreadful copying and pasting that I’m trying very hard to avoid like the plague. Oh, how I eschew the arduous labor involved in copying and pasting.

I eventually decided to whip up a quick and simple solution for myself by hacking up something I called Cliptext. Cliptext is a Chrome extension/web app/Android app that converts text selections into images that can be automatically shared on Twitter. It started out as an excuse to create a Chrome extension. Getting some insight into Chrome extensions was nice, but crafting something I knew I was actually going to use quite often was even more delightful.

I am nowhere near being some sort of Kryptonian super-programmer who knows how to do everything off the bat. I typically need a lot of consultation from multiple sources. My first step to conjuring up Cliptext was to read the Chrome extension getting started guide in order to get a feel for how Chrome extensions are structured. Subsequently, I needed to figure out how to add items to Chrome’s context menu on text selections. This blog post by Paul Kinlan helped set me straight in that regard.

My starting point on the PHP side of things was this Stack Overflow question where someone asked how to convert text to an image. The answers to that Stack Overflow question suggested that PHP’s GD/Image Processing library was the way, the truth and the life.

Not wanting to mess with PHP cURL, stream contexts and all that jazz, I decided to take advantage of some of the existing Twitter API wrappers/libraries. My first attempt was to use codebird-php, but using it felt like swimming in snake-infested muddy water and I got bitten a few times. I guess I just could not get it working as quickly as I had hoped. I decided to try TwitterOAuth as an antidote to all that snake venom from using codebird-php. Pure heaven it was! You should try it sometime.

Cliptext did not require any complicated code gymnastics and is as simple as Chrome extensions and web apps get. The intentionally plain user interface is based on Skeleton. Putting these resources together, I eventually came up with a working version of Cliptext that met my requirements. Thanks to the various authors of the resources I used.

I have made the Chrome extension available on the Chrome Web Store and the Google Play Store in the hope that it will be useful to someone. I have also provided the source code on my Github page. Maybe someone out there might be able to learn something from it, just as I have been able to learn from others.

Struggling to Keep Up; Rediscovering the Pomodoro Technique

Going through a Computer Science PhD program is a grueling ordeal. At this point, it is not about writing exams and doing homework. There is that great scary monster called RESEARCH that needs to be tamed. If there is only one thing I have learned thus far, it is that computer science research is quite complex and time-consuming.

I truly enjoy building and thinking about web and mobile applications. However, I have reason to believe that it is possible to be a stellar doctoral student in Software Engineering and be relatively ignorant of web and mobile applications development trends. Chances are that things like AngularJS, Grunt, Jade or Ionic have little to no business with your research. This is not to say that there are no doctoral students conducting research in web development tools and technologies. It is just that as a doctoral student in Computer Science, you are more likely to be dealing with things at a much lower level than a JavaScript framework or templating engine.

My research interests are in Software Engineering. As a doctoral student in this area, I do not necessarily get to spend a lot of my time sharpening my applications development skills even though I am actively carrying out Software Engineering research. Trying to stay aware of rapidly evolving web and mobile applications development trends and keep my applications development skills and project portfolio up to par while also making reasonable progress on my doctoral research has been a real struggle. I daresay this is not for the faint of heart. There are a thousand and one things to do, deadlines to meet and self-assigned goals to achieve. Having a PhD is nice, but I refuse to graduate with the degree and find out I have lost touch with practical and readily applicable industry tools and trends.

A few months ago, I came to the stark and sudden realization that I simply was keeping up with neither my research nor advances in web and mobile applications development tech. Every Software Engineer knows how quickly the software development landscape evolves and how vital it is to stay up to date if one is to remain relevant. Not only was I struggling to find research direction, but I was also beginning to lose track of the myriad tools, frameworks and patterns that are becoming de facto industry standards in this present day and age. I needed to do something about my apparent lack of productivity and effectiveness if I was to have any hope of being the kind of well-rounded computer science professional I strive to be.

Sometime early last year, while carrying out my regular “online stalking” activities, I stumbled on a number of posts by Adulfattah Popoola where he wrote about how he uses the Pomodoro Technique to improve his efficiency and productivity. Since I am one hell of a copycat, I decided to adopt the technique and structure my days around Pomodoros.

One fundamental assertion of the Pomodoro Technique is that multitasking and distractions are counter-productive. Many of us like to think we can multitask effectively. However, a focused assessment of just how much work we actually get done while “multitasking” compared to what we achieve if we focus on a single task during a single time stretch seems to suggest that humans are pitifully bad at multitasking.

The following is my basic understanding of the Pomodoro Technique and how I have chosen to use it.

  • A Pomodoro time block is typically 25 minutes, although you can choose to make it whatever length of time you please.
  • The Pomodoro Technique suggests that you focus on a single task during a time block and eliminate ALL distractions during that time. This means no social media, no phone calls, texting, eating, drinking, smoking, dying, NOTHING but the task at hand all through the entire stretch of a single time block.
  • After each 25-minute time block is a 5-minute break during which you can do whatever you want. During this break time, I would typically do a quick social media crawl, read some random article or take a walk to make sure my legs still function the way they should.
  • After each set of four 25-minute time blocks, you get a 15-minute break. I usually spend this time returning phone calls and text messages, clowning around and arguing with my colleagues in nearby research labs.
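
The rules above can be sketched as a small schedule generator. This is only an illustration of my reading of the technique: the function names are hypothetical, the default lengths are the 25/5/15 minutes listed above, and I assume no break is scheduled after the final block.

```python
from typing import List, Tuple


def pomodoro_schedule(blocks: int, work: int = 25, short_break: int = 5,
                      long_break: int = 15,
                      set_size: int = 4) -> List[Tuple[str, int]]:
    """Return (label, minutes) pairs for a run of Pomodoro time blocks."""
    schedule: List[Tuple[str, int]] = []
    for i in range(1, blocks + 1):
        schedule.append((f"work {i}", work))
        if i == blocks:
            break  # assumption: nothing is scheduled after the last block
        if i % set_size == 0:
            # A longer break follows every completed set of blocks.
            schedule.append(("long break", long_break))
        else:
            schedule.append(("short break", short_break))
    return schedule


def focused_minutes(schedule: List[Tuple[str, int]]) -> int:
    """Minutes spent on work blocks only."""
    return sum(m for label, m in schedule if label.startswith("work"))


def total_minutes(schedule: List[Tuple[str, int]]) -> int:
    """Minutes spent on work blocks and breaks together."""
    return sum(m for _, m in schedule)


if __name__ == "__main__":
    plan = pomodoro_schedule(8)
    for label, minutes in plan:
        print(f"{label:12s} {minutes:3d} min")
    print("focused:", focused_minutes(plan), "min")
    print("with breaks:", total_minutes(plan), "min")
```

Under these assumptions, eight blocks come to 200 minutes of focused work, or 245 minutes (about four hours) once the breaks are counted.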

My days are typically split into two phases: research and teaching during the day, and side projects, reading and programming practice during late evenings and at night. I aim to put in 8 Pomodoros (just 4 hours of focused work) each day at the research lab working on my research, and 8 Pomodoros (another 4 hours of focused work) at home working on side projects, learning new programming tools/techniques and just generally “keeping up”. In total, that’s 16 Pomodoros (time blocks of 25 minutes) each day with pre-assigned tasks split across these time blocks. For instance, on a good evening, I currently spend 2 Pomodoros each (just 1 hour) on Android programming, JavaScript, algorithms/problem-solving and hybrid mobile applications development frameworks (I love Ionic). This configuration of tasks evolves over time as I choose to learn new things. I use an Android app called Clockwork Tomato to keep track of my time blocks.

What I find most amusing about all this is how ridiculous the actual number of hours seems to be. A mere 4 hours of work each day at the research lab seems like so little time, but you will be amazed by how difficult it can be to put in focused work for that amount of time. Also, finding 4 distraction-free hours does not always happen in one fell swoop as things get in the way and there are sometimes other things to attend to. So, squeezing out 8 Pomodoros (4 hours) sometimes requires being at work for 8 hours. A lot of the time, I am not able to complete my Pomodoro sets for the day seeing as there are times when the mind is willing but the body is weak. However, at the bare minimum, I try to complete one set of four Pomodoros each on research and side projects/interests each day. Struggling through eight Pomodoros in as many hours has a way of showing you just how little “real” work we actually get done throughout the typical work day. It is funny how life and distractions get in the way.

All this is not about working till I drop or trying to achieve mind-blowing feats of software engineering each day. It is about consistency: putting in steady, bite-sized amounts of work and practice each day without fail. In the last few months, I have made some strides that have left me a lot less devastated when I think of the state of my research and software development skills. I have been able to learn things that I previously kept postponing. I believe I now have a vision and work plan for my research. Of course, I am not completely satisfied. Just a lot less sorry for myself. :)

There is lots of information on the Internet regarding the Pomodoro Technique and I understand there is even an entire book (which I have actually never read). I believe anyone who decides to use the technique needs to come up with his or her own adaptation that works best in his/her own unique circumstances. I can quite confidently say the technique has helped me keep up with the 1000 things that compete for my attention. I am definitely eager to see what things I am able to accomplish in the long run.

What productivity hacks and/or adaptations of the Pomodoro Technique have helped you keep up with overwhelming tasks and goals? Please share in the comments.

Python Implementation of TextRank (Github Repo)

Find it on Github: https://github.com/davidadamojr/TextRank

A few months ago, I wrote an implementation of “TextRank” in Python. TextRank is an algorithm for automatic keyword and sentence extraction (summarization) proposed by Rada Mihalcea and Paul Tarau in this paper. However, unlike the approach taken in the paper, this implementation uses Levenshtein Distance as the relation between text units.
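
To make the idea concrete, here is a minimal, self-contained sketch of ranking text units on a graph whose edge weights are Levenshtein distances. It is not the code from the repository: the function names, the fully connected graph, the damping factor of 0.85 and the fixed iteration count are all assumptions made for illustration. Note that a raw distance gives dissimilar units the heaviest edges; this post does not say whether the actual implementation normalizes or inverts the distance.

```python
from itertools import combinations
from typing import Dict, List


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (row-by-row dynamic programming)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                current[j - 1] + 1,             # insertion
                previous[j] + 1,                # deletion
                previous[j - 1] + (ca != cb),   # substitution
            ))
        previous = current
    return previous[-1]


def rank_units(units: List[str], damping: float = 0.85,
               iterations: int = 50) -> Dict[str, float]:
    """Weighted PageRank over a complete graph of unique text units."""
    nodes = list(dict.fromkeys(units))  # de-duplicate, keep first-seen order
    n = len(nodes)
    if n == 0:
        return {}
    if n == 1:
        return {nodes[0]: 1.0}

    # Edge weight = Levenshtein distance (an assumption, see the note above).
    weights: Dict[str, Dict[str, int]] = {u: {} for u in nodes}
    for u, v in combinations(nodes, 2):
        w = levenshtein(u, v)
        weights[u][v] = w
        weights[v][u] = w
    out_total = {u: sum(weights[u].values()) for u in nodes}

    scores = {u: 1.0 / n for u in nodes}
    for _ in range(iterations):
        scores = {
            u: (1 - damping) / n + damping * sum(
                scores[v] * w / out_total[v] for v, w in weights[u].items()
            )
            for u in nodes
        }
    return scores


if __name__ == "__main__":
    ranked = rank_units(["satire", "satirical", "news", "newspaper", "onion"])
    for unit, score in sorted(ranked.items(), key=lambda kv: -kv[1]):
        print(f"{unit:10s} {score:.4f}")
```

The same ranking step would apply to sentences for summarization; only the text units fed into the graph change.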

This implementation carries out automatic keyword and sentence extraction on 10 articles taken from http://theonion.com.

It achieves the following:

  • Generates a 100-word summary for each article
  • Extracts a number of keywords proportional to the size of the text (a third of the number of nodes in the graph)
  • Concatenates adjacent keywords in the text into keyphrases
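
The last two points can be illustrated with a short sketch. The helper names below are hypothetical and the scores are made-up values standing in for the output of the ranking step; keeping at least one keyword and keeping single keywords that have no keyword neighbor are my own assumptions.

```python
from typing import Dict, List


def top_keywords(scores: Dict[str, float]) -> List[str]:
    """Keep the highest-scoring third of the graph's nodes (at least one)."""
    count = max(1, len(scores) // 3)
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:count]


def merge_adjacent(tokens: List[str], keywords: List[str]) -> List[str]:
    """Join runs of keywords that are next to each other in the text."""
    keyword_set = set(keywords)
    phrases: List[str] = []
    run: List[str] = []
    for token in tokens + [None]:  # None is a sentinel that flushes the last run
        if token in keyword_set:
            run.append(token)
        elif run:
            phrase = " ".join(run)
            if phrase not in phrases:
                phrases.append(phrase)
            run = []
    return phrases


if __name__ == "__main__":
    # Made-up scores for illustration only.
    scores = {"natural": 0.30, "language": 0.25, "processing": 0.20,
              "is": 0.05, "fun": 0.10, "today": 0.10}
    tokens = ["natural", "language", "processing", "is", "fun", "today"]
    keywords = top_keywords(scores)          # two of the six nodes
    print(merge_adjacent(tokens, keywords))  # ['natural language']
```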

This algorithm is useful for automatic keyword extraction and summarization of a given body of text.

The implementation has the following dependencies:

Find it on Github: https://github.com/davidadamojr/TextRank