It’s been a while since I’ve posted on my data&stuff (WordPress) blog. The reason is that I’ve come to the realisation that while R is extremely useful, Python is the future. I spent the past few weeks picking up Python: I did several online courses and I’m currently working on a simple project, the Kaggle Titanic Machine Learning problem.

As I worked on this project, I made two discoveries. First, Python is amazing for data science. The sklearn and pandas modules are pure brilliance: pandas makes data cleaning much easier, and sklearn does the same for modelling. One of the excuses I used for not learning Python was that R provided better graphics. Lo and behold, I found out that ggplot is available in Python! Besides, matplotlib creates beautiful graphs too. I became thoroughly convinced that Python is a better language for data science.

Second, GitHub is superb for version control. I’m currently using it for the Kaggle Titanic project, and I’ve decided that GitHub will be the new home for my blog! Previously, I used RStudio’s interface to RMarkdown to write all my blog posts. I could write the main text and generate in-line graphics in the same document, which I found incredibly convenient. However, combining GitHub and Jupyter notebooks allows me to do the same in Python! Hence, having made the switch from R to Python, I intend to use GitHub to host my blog.

Moving ahead, you can expect more mini-projects, and more posts on data science techniques. I look forward to the exciting new adventures with Python and GitHub!