databases

Python Pandas vs SAS: Head to head data analysis (Part 1)

Like every other data scientist out there, I recently asked myself: “What programming language or data analytics tools should I learn to become a good data scientist?” The answers to this question are as varied as the tools themselves. Interestingly, the same languages often emerge in the top 3, depending on which platform (LinkedIn, Indeed, StackOverflow, Reddit, etc.) you get your data from. Recently, RJMetrics published a comprehensive article which I found […]

Python vs SAS: Computing summary statistics (Part 2)

I recently started a series of blog posts to share my work experience using SAS and Python Pandas for data analysis. If you’re coming directly to this post, you may want to start with my first post, “Python Pandas vs SAS: Head to head data analysis”. In part two of the series, I will use the very powerful Group-Apply-Combine feature in Python Pandas to compute summary statistics and show the equivalent in SAS as well. Then I’ll […]
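
As a rough illustration of the Group-Apply-Combine idea mentioned in this excerpt, here is a minimal sketch. The DataFrame, its column names (`region`, `sales`) and its values are made up for illustration and do not come from the original post. In SAS, a broadly comparable result can be obtained with `PROC MEANS` and a `CLASS` statement.

```python
import pandas as pd

# Hypothetical data, invented purely for this example.
df = pd.DataFrame({
    "region": ["East", "East", "West", "West", "West"],
    "sales": [100.0, 150.0, 200.0, 50.0, 125.0],
})

# Split the rows by region, apply each aggregation to every group,
# and combine the results into one summary table indexed by region.
summary = df.groupby("region")["sales"].agg(["count", "mean", "min", "max"])

print(summary)
```

Each row of `summary` describes one group, and each column holds one statistic.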

Four types of NoSQL Databases you should learn now

So you’ve heard about MongoDB, CouchDB, Cassandra, Riak, Redis, Neo4j, InfiniteGraph, Voldemort and IBM Cloudant as new classes of databases, but you’re not sure what the differences are or how they differ from the traditional SQL-type databases that you learned several years ago in school. Well, you’re not alone. It’s been some time since I worked with databases. Most of my recent work has been theoretical performance analysis of wireless networks, and solving mathematical optimization problems where I end […]