Summary of Reading (June ’20)

Rafa (Rafael Nadal) – I've never really been that engrossed with tennis, so reading this book was an insight into a world I've never really been a part of, but it was fun nonetheless. Major parts of the book seemed to be devoted to more tennis-tuned readers, given that there's a lot on game specifics and a lot of tennis terminology that went flying over my head. I found it really insightful to read about the impact his social and mental stability had on his physical stability and his game. I especially enjoyed the sections on the importance of family to Nadal, and how his circle of stability helps him. It was also inspiring to read about having such a concrete routine that you're at the court at 5 am no matter what and no matter how much sleep you were able to get. He talks a lot about maintaining a very focused mental state during the game, filtering everything out but the game so that you have very high concentration, and I relate to that not just because I've played cricket in the past, but because that state of mind where nothing else can impact you and all your mind is bent on a particular task is something I try to attain, and do attain, while programming or reading. The sections on humility are also wonderful to read, and mildly humorous sometimes, especially the parts where he talks about how his uncle and coach, Toni, makes him perform small, seemingly irrelevant gestures like not walking in the middle of the group and not violating dress codes. I also enjoyed reading about Mallorca and the societal setup there; the amount of peace Nadal gets is incredibly valuable given that not many athletes can afford that in the modern world, and seeing him talk about how important a factor that is is brilliant and wholly justified. This book deviated from my usual sci-fi adventures, and I enjoyed it, though I found the long stretches describing games a bit too much.

The Three Body Problem (Cixin Liu) – An incredible book that brings together two vastly different storylines in a manner I am yet to see elsewhere. The book throws you straight into the action at the start, wherein the antagonist's father dies, forming the motivation for an act we see quite a while later. Everything that occurs at the start, the countdown, weird results from experiments, the universe flashing, seems to be pure sci-fi that the author either fails to or chooses not to explain. The beauty of this book lies in turning that around into something rational and believable. Even more so, it's amazing to see the Three Body video game turn into something that exists in the real world. Of course, setting aside the question of how life even came to be on such a world, it's fascinating to see alternate theories play out as to how the environment functions, and to see hundreds of years of scientific progress occur in the span of a few pages. The different scenarios that play out, tri-solar syzygies, triple-sun days, chaotic eras and so on, make this book incredibly fascinating. Add to that the depth to which all the characters have been created: we have Wang, who is oblivious to the grand scheme of things but a good man at his core; then we have Da Shi, who doesn't care for the big picture and is not a man of science, but is the guy who always solves the problem through his core set of principles; and lastly we have Ye, an almost psychopathic personality with a deep hatred for humankind, such that she's effectively brought an end to it. Learning about the three-body problem was an incredible experience for me. I also loved the author's note about how most ideas that take off fall back to the ground because the gravity of reality is too strong.

Delta-V (Daniel Suarez) – Very much new, and very relevant. While some of the things in the book were downright outlandish, how Suarez portrayed everything, and how real he made the dangers of space travel seem, made this book amazing to read. The candidate selection process had a lot of time and space devoted to it, and it paid off, becoming one of the best pre-climax sections I've personally read. In fact, I'd say some of the stuff in the candidate selection section was better than the rest of the book and the climax; it was really good at keeping me hooked. Some stuff that stood out during the candidate selection process: the high-CO2 atmosphere puzzle-solving event, the psychological test, and most importantly, how bonds were being formed. When the actual crew went up to Hotel LEO to pursue other projects, I thought that the book was dying down, and I was personally betting on the chosen crew dying in their first attempt and the actual crew substituting for them, but what Suarez did was much better, although admittedly a bit rushed. I also couldn't quite comprehend why a spaceship that had 14 billion dollars invested in it had so many software errors that living became unbearable and potentially fatal for the crew. I mean, come on, if you have 14 billion dollars, surely you can hire a decent programming team that doesn't mess the job up. The entire concept of mining was made so much better by the sample schematics attached by Suarez towards the end; they really paid off. The arrival of the Argo was also surprising, and Joyce's downfall was kind of expected but tragic nonetheless. The worst part of the book, in my opinion, was how the new investors treated the astronauts. That seemed like pure fiction, unlike the rest of the book, which was genuinely believable and inspiring. This book also hammered into me the sort of challenges future astronauts can face, and the sacrifices they might have to make. It really drove the point home; space is hard.

Ender's Game (Orson Scott Card) – Pretty good read. What really held me back was the sheer outlandishness of a boy of eleven years of age leading an entire army and being representative of the Earth's military WHILE not even realizing he was doing this. I know it's purely fictional, and the story is quite impressive, but the entire notion of such a young person doing all that is beyond me. Another thing I did not understand was why Mazer Rackham didn't lead the armies. He told Ender that he wouldn't be alive by the date of the future battle, when in reality that same battle was Ender's training. The simple explanation for this is that Ender is better than Mazer. That being said, no matter how good Ender is, why would the authorities choose an 11-year-old boy with an incredible skill set over an experienced veteran and celebrated hero? That doesn't make any sense to me. I love the elegance in how Ender is compassionate at heart and Peter is more hurtful, but how events reverse roles, making Ender seem hurtful and Peter seem compassionate. Throughout the book, Ender's desire to not be like Peter holds him back and dominates him, whereas Peter's desire to be more compassionate drives him to greater heights. It's a stunning depiction of how events in the real world can pan out, and how roles can be reversed even when characters are not. In very cliché fashion, I'm going to say that the character I bonded the most with was the main character, Ender, but only because of his tendency to look at things the way they were and not the way they were conventionally taken to be. The best example of this is how each team thought of the battleroom as horizontally aligned, but Ender was the only one who viewed it as a place where you're going down towards your enemy, not straight at them, and that changed a lot. His tendency to tackle the rules is also brought out by the match against two teams, where he sends a man through the gate rather than eliminating both teams. The ending is a bit heartbreaking, especially when Ender realizes that he didn't just obliterate the opposing side, which had its own culture and legacy, but also sacrificed soldiers on his own side without knowing so, and seeing how the guilt of this plays out is fascinating.

Recursion (Blake Crouch) – A book that takes science and throws it out the front door. The reasoning that time has no linearity, and that the past, the present, and the future exist as one, just makes no sense. The simplest way to gauge time is to drink tea; your cup of tea cools down gradually, and that's the progression of time. However, time isn't a thing in this book. Instead, it's a virtual construct made by our brains to make everything much simpler. And guess what this means: you can travel back and forth, not in time, but in your memories. Now, in normal conditions, this would be time travel, but since time apparently does not exist in this book, or is as traversable as a physical dimension, this is just memory travel. The part that struck me the most was how Helena and Barry were not able to figure out how to nullify everything after the original timeline in what was about 198 years, when Slade did that on his own in just a single timeline. Both of them thought about it so much that Barry, who was a police detective in the first timeline, became an astrophysicist/quantum physicist in the second and was trying to calculate the Schwarzschild radius of a memory (what?). Neither of them thought about returning to the previous timeline by activating a dead memory. It's only logical that if you can time travel, the only way to nullify your existing timeline is to go back and cut out the event that birthed it. Instead, Barry and Helena decided to try and find a way to remove dead memories themselves rather than the events that created those dead memories; had they pursued the latter across those 198 years, they would've succeeded. I also don't get why people need to be killed to be sent back into their own timeline, or why the U-shaped building just appeared and caused mass FMS rather than being there forever, with people getting FMS when the cut-off date finally came.

Currently Reading

The Amazing Adventures of Kavalier and Clay (Michael Chabon)

The Brain (David Eagleman)

Implementing Hierarchical Clustering (Python)

Clustering is an important unsupervised learning technique. It aims to split data into distinct clusters. For instance, if you input data pertaining to shoppers in the local grocery market, clustering could output age-based clusters of, say, < 12 years, 12-18, 18-60, and > 60. Another intuitive example is banking: clustering financial data for a large group of individuals could output income-based clusters, say three pertaining to the lower middle, upper middle, and upper classes.

The most basic and intuitive method of clustering is K-means, which identifies K clusters. It randomly initialises K centroids, assigns each point to its nearest centroid, recomputes each centroid as the average of the points assigned to it, and repeats until the K averaged-out cluster centroids stop changing. Hierarchical clustering, however, is based on a much different principle.
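To make that concrete, here's a minimal K-means sketch on toy 2-D data (the data and the choice of K=2 are made up purely for illustration):

from sklearn.cluster import KMeans
import numpy as np

# Toy data: two obvious groups of 2-D points, purely for illustration
X = np.array([[1, 2], [1, 4], [1, 0],
              [10, 2], [10, 4], [10, 0]])

# K-means with K=2: initialise 2 centroids, assign each point to its
# nearest centroid, recompute centroids as cluster means, repeat
kmeans = KMeans(n_clusters=2, random_state=0).fit(X)
print(kmeans.labels_)           # cluster index for each point
print(kmeans.cluster_centers_)  # the 2 averaged-out centroids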

There are two methods of hierarchical clustering: agglomerative and divisive. Agglomerative puts each data point into a cluster of its own. Hence, if you input 6000 points, you start out with 6000 clusters. It then merges the closest points (and then the closest clusters) and repeats this process until there is only one giant cluster encompassing the entire dataset left. Divisive does the exact opposite: it starts with one giant cluster and splits repeatedly until each point is its own cluster. The results are plotted on a special plot, called a dendrogram. The longest vertical line segment on the dendrogram points to the optimum number of clusters for analysis. This will be much easier to understand when I show a dendrogram below.

Here is a link to the dataset I've used. You can access the full code here. This tutorial on analyticsvidhya was also immensely helpful to me in understanding how hierarchical clustering works.

Note the libraries I've imported. The dataset is fairly straightforward: you have an ID for each customer, and financial data corresponding to that ID. You'll notice that the standard deviation, or range, differs a lot between features. Where balance frequency tends to stay close to 1 for each ID, account balance is wildly different. This can cause issues during clustering, which is why I've scaled my data so that each feature is similar to every other feature in relative terms.
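For reference, the setup boils down to something like this; the file name and the CUST_ID column are stand-ins based on the dataset's description, and the exact scaler is my choice here:

import pandas as pd
import matplotlib.pyplot as plt
import scipy.cluster.hierarchy as shc
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import AgglomerativeClustering

# File and ID-column names are stand-ins for the linked dataset
df = pd.read_csv('CC GENERAL.csv')
df = df.drop('CUST_ID', axis=1)   # the customer ID carries no signal
df = df.fillna(df.mean())         # patch any missing values

# Scale every feature to zero mean and unit variance so wide-ranging
# columns (account balance) don't drown out near-constant ones
# (balance frequency)
scaled = StandardScaler().fit_transform(df)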

Here is the dendrogram for the data. The y-axis represents the 'closeness' of each individual data point/cluster. You'd obviously expect y to be maxed out when there's only 1 cluster, so that's no surprise. Now, looking at this graph, we must select the number of clusters for our model. A general rule of thumb is to take the number of clusters pertaining to the longest vertical line visible here. The height at which two clusters merge represents how far apart those clusters are. Hence, you want a small number of clusters (not necessary, but in this application, optimal), but you also want your clusters to be spaced far apart, so that they clearly represent different groups of people (in this context).
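A sketch of the code that produces such a dendrogram, continuing from the setup above (Ward linkage is one common choice, not necessarily the only one):

# Ward linkage merges, at each step, the two clusters whose merge
# least increases the total within-cluster variance
plt.figure(figsize=(10, 7))
plt.title('Customer Dendrogram')
dend = shc.dendrogram(shc.linkage(scaled, method='ward'))
plt.ylabel('Distance between merged clusters')
plt.show()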

I'm taking 3 clusters, which corresponds to a vertical axis value of 23. Three clusters also intuitively make sense to me, as any customer can broadly be classified into lower, middle, or upper class. Of course, there are subdivisions inside these 3 broad categories too, and you might argue that the lower class wouldn't even be represented here, so we can say that these 3 clusters correspond to the lower middle, upper middle, and upper classes.

Here is a diagrammatic representation of what I’ve chosen.

After building the model, all that's left is visualising the results. There are more than two features, so I'm arbitrarily selecting two, plotting all points using those features for my axes, and giving each point a color shared by all other points in its cluster.
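Continuing the sketch above, the model and the scatter plot look roughly like this:

# Fit agglomerative clustering with the 3 clusters read off the dendrogram
model = AgglomerativeClustering(n_clusters=3, linkage='ward')
labels = model.fit_predict(scaled)

# Arbitrarily use the first two features as axes and colour each
# point by its cluster label
plt.figure(figsize=(10, 7))
plt.scatter(scaled[:, 0], scaled[:, 1], c=labels, cmap='viridis')
plt.xlabel(df.columns[0])
plt.ylabel(df.columns[1])
plt.show()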

You’ll notice that there is a lot of data here, but also a clear pattern. Points belonging to the purple cluster visibly tend towards the upper left corner. Similarly, points in the teal cluster tend to the bottom left corner, and points in the yellow cluster tend to the bottom right corner.

Implementing PCA and UMAP in Python

You can find the full code for PCA here, and the full code for UMAP here.

Dimensionality reduction is an important part of constructing machine learning models. It is basically the process of combining multiple features into a smaller number of features. Features that have a higher contribution to the target value have a greater representation in the final combined feature than features that contribute less. For instance, if you have 8 features, the first 6 of which have a summed contribution of around 95%, and the last 2 of which have a contribution of only about 5%, then those 6 features will have a greater representation in the final combined feature. In terms of advantages, the most significant is lower memory usage and hence higher modeling and processing speed. Other advantages include simplicity and easier visualization. For instance, you can easily plot the contribution of two combined features to the target, especially compared to plotting, say, 20 initial features. Another significant aspect is that features with less contribution, which would otherwise add useless 'weight' to the model, are removed early on.

The two methods of dimensionality reduction I will be using are PCA and UMAP. I won't be going through how they work in depth, as I've given a short overview of their purpose above. Instead, I'll go through the code I implemented for each, and visualize the results. For this exercise, I'm using the WHO Life Expectancy dataset that can be found on Kaggle, as it's very small and easy to work with. My target variable will be life expectancy, and my features will be aspects like adult mortality, schooling, GDP etc. I randomly selected these features from the dataset.

Here is a list of the modules we will be using. train_test_split will help us break our data into a training set and a testing set (about a 7:3 ratio). While this isn't significant right now, it aids in the detection of underfitting and overfitting: underfitting shows up as bad performance on both the training set and the testing set, whereas overfitting shows up as really good performance on the training set but bad performance on the testing set. StandardScaler has been used to normalise features. Feature normalisation is a technique that reduces the range of the dataset, or the standard deviation, in layman's terms. Lastly, we've imported both PCA and UMAP.
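In code, the imports look roughly like this (the umap module comes from the umap-learn package):

import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
import umap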

Here we just load our dataset, extract the features that will be used (see column names in the dataframe), and rename them for the sake of simplicity. As you can see, there are some random spaces and not all use underscores as notation, so I decided to have one uniform way of typing out each feature. Now, to extract a feature matrix and a target vector, just drop the life_expectancy column from the dataframe and convert it into a numpy array, and convert the life_expectancy column into a separate numpy array. I won’t show the code for splitting and normalising, because that’s pretty much irrelevant here.
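A sketch of that step; the file name and the exact column subset are stand-ins for what's in my notebook, and I've tacked the split and scaling onto the end in a few lines so the later snippets run:

# File name and column subset are stand-ins for the Kaggle dataset
df = pd.read_csv('Life Expectancy Data.csv')
df = df[['Life expectancy ', 'Adult Mortality', 'Schooling', 'GDP']]
df.columns = ['life_expectancy', 'adult_mortality', 'schooling', 'gdp']
df = df.dropna()

# Feature matrix X and target vector y as numpy arrays
X = df.drop('life_expectancy', axis=1).values
y = df['life_expectancy'].values

# Roughly a 7:3 split, then feature normalisation (details skipped above)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)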

Implementing PCA in itself is very simple, as shown above. You'll notice that I've specified n_components to be equal to 2. The number of combined features you want at the end is up to you; scikit-learn's PCA keeps as many components as it can if you don't specify a number, so setting n_components=2 is what gets us down to two combined features. After that, I've fitted PCA to the training data.
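The PCA step itself is just a few lines:

# Reduce the training features to two principal components
pca = PCA(n_components=2)
pca_result = pca.fit_transform(X_train)
print(pca.explained_variance_ratio_)  # contribution of each component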

Here’s a bit of data treatment before I finally plot the results. I’ve basically converted PCA’s output, which was a numpy array, to a pandas dataframe, and then added life_expectancy as a column because that will be used for the color-bar you will see below.

Here is the code for my plot, and here is the plot:
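The plotting code amounts to something like this:

# Wrap PCA's numpy output in a dataframe and tack the target back on,
# so the colorbar can encode life expectancy
pca_df = pd.DataFrame(pca_result, columns=['pc_1', 'pc_2'])
pca_df['life_expectancy'] = y_train

plt.figure(figsize=(10, 7))
points = plt.scatter(pca_df.pc_1, pca_df.pc_2,
                     c=pca_df.life_expectancy, cmap='viridis')
plt.colorbar(points, label='life expectancy')
plt.xlabel('First principal component')
plt.ylabel('Second principal component')
plt.show()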

Each marker's position shows a data point's values for the two components, and its color represents the target value, life expectancy. While I don't see any patterns straightaway (specific colors being clustered somewhere etc.), the primary thing that does stand out is how heavily the green dots (~70 years of life expectancy) are clustered towards the bottom left. Other colors appear there as well, but there don't seem to be many green dots anywhere else.

The code for UMAP is exactly the same, except with UMAP as our decomposer instead of PCA. Here's the plot.
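For completeness, the UMAP version:

# Same pipeline, with UMAP as the decomposer instead of PCA
reducer = umap.UMAP(n_components=2)
umap_result = reducer.fit_transform(X_train)

umap_df = pd.DataFrame(umap_result, columns=['umap_1', 'umap_2'])
umap_df['life_expectancy'] = y_train
plt.scatter(umap_df.umap_1, umap_df.umap_2,
            c=umap_df.life_expectancy, cmap='viridis')
plt.colorbar(label='life expectancy')
plt.show()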

You can straightaway see that the results of UMAP are quite different. Once again, there are no noticeable patterns in terms of specific colors being clustered in specific locations, but the overall structure is quite different from that of PCA. We can see that each color is distributed throughout.

There's no way to say which method is better without modeling your target variable with respect to each method's components and comparing accuracy on the testing set. This post just aims to illustrate how both of them work without going into specific details.

Sustainability: Why it Matters and What I’m Trying to do about it.

There are so many facets of global warming and climate change that people are aware of but don't understand. This leads to a lot of focus on very specific issues, which, in turn, leads to negligence of other problems. For instance, one such facet is decreasing tree cover in urban areas and loss of forested areas. This problem is very simple to understand. People need space to live, and they need shelter to live in; hence, they take a plot of land with trees and cut the trees down to get space and materials for building those shelters. Sustainability here would be planting more saplings elsewhere and restricting yourself to the plot you cleared out initially. If, with time, you lack space, build vertically, not horizontally. Leave room for mother nature. Instead, due to an ever-increasing need for residential and industrial areas, cities continue to expand at a breathtaking pace. Find an image comparing what cities looked like ten years ago with what they look like now, and you'll see what I'm talking about. Urban areas constantly demand more homes, offices, power generation facilities, shopping malls, roads etc. The list goes on and on. Nature doesn't. Due to urban expansion and hence reduced green cover, carbon dioxide levels are ramped up, with more vehicles to produce it and fewer trees to remove it. Describing the different ways through which carbon dioxide is produced is futile. More urban areas result in more airports, not just one for each city, but more than one for the large cities. Airports need vast patches of land, which means less tree cover. More airports mean more flights, and aircraft are one of the biggest singular producers of greenhouse gases, so that just keeps adding on. This is just one impact of not having enough green cover; other consequences include higher temperatures (and hence more need for air conditioning in buildings and vehicles, which has a negligible per-unit impact but a very large aggregate impact), soil erosion, elimination of natural flood barriers etc. It's bad. You might say that cutting down a small patch of trees has no impact in the larger scheme. You're right, it doesn't. But when millions of people say that, the impact adds up, and it leads to the world we're in today.

Now, less green cover is just one facet of the problem, as I said at the start. You might've realized the magnitude of this problem, but there are other problems out there that bear the same, if not heavier, consequences. There's rising temperatures, rising sea levels, desertification etc. And these are just other aspects of the same overarching problem: global warming. Beyond that, there are a host of other issues out there: space junk, war, plastic pollution etc.

But then, the reason one can't solve all these issues is that they're too large. That doesn't just mean each one needs too much effort and time; it means that each one needs too much money, on the scale of billions of dollars. Knowing this, I set out to try and make my own small-scale, high-impact solution to the problem of reduced tree numbers: Treephillia.

Planting trees is something we're all taught as kids. Kindergarten and elementary school are full of activities where PT teachers show you how to plant a sapling; then you go ahead and plant yours, and it feels like you're doing your bit for the world. It's something we all read about, and I do believe that the mass media deserves a lot of credit for showing the world just how important planting trees can be. In the developed world, and in a significant proportion of the developing world, most people know about this issue, even though they may not do anything about it or even care. Awareness is there, but no one knows what to do. Take person X, for example. X wants to contribute by planting trees, but he has no idea about what to plant or how to plant it. He doesn't know where he should plant it, and he doesn't know which saplings he should buy from where. X is also aware of how important planting trees is, but is unsure about the impact just one plantation can have. X is representative of the majority of Earth's population in this matter.

My application tries to remedy these issues. With inputs from experts who have field experience, I can advertise to its users what they should plant. For instance, the eucalyptus is a 100% no-no; it might look pretty, but it stunts the growth of other trees. With a plantation site feature that enables officials within the local forestry department to mark spots ripe for public plantation campaigns, I can tell X where to go to plant his tree. With a map that enables X to see plantations, he gets to know the larger impact. If you have a city with a population of 1 million people, and 1 out of 100 plant just a single sapling, you still have 10,000 plantations. If you have a country with a population of 100 million people, then, extending the same 1-in-100 assumption, you have 1 million plantations. That's a lot, and that would really matter. My application gives its users the personal interface they need to plant trees smartly in the modern era. And no, it's not just limited to the stuff I listed out above. There is a serious lack of incentive surrounding planting trees, so I also decided to implement a voucher feature that rewards users who plant trees. While a voucher should not be the reason to want to plant trees, it serves as a pathway to doing so, and that works.

Now, how exactly does my application work in reality? It's not just for individual users who want more information and who want to record what they actually do. For instance, businesses can use it to have each employee plant a tree in the event of an office birthday and track the trees planted on a map. People can plant a tree on the important occasions of their life, say birthdays, anniversaries etc. Hotels can get employees to plant a tree when a guest has a birthday, or maybe even plant one without any occasion to visually improve the setting their guests stay in. The government can use it to track tree plantations and mark planting sites for the public. NGOs in this sector can use it to reach people who genuinely care and want to have an impact on the world through planting trees.

Tree plantation is also personally very important to me. As a 7-year-old, I helped Dad plant a sapling outside our old house. For four years, I saw that sapling grow from a timid little thing to a leafy tree. Every time I return to that house, I think about just how much that one tree has grown. With this in mind, I've always drawn an analogy between planting a sapling and a mass tree plantation movement. Like a sapling, any plantation trend would be small at first. But, with time, it'll grow. It'll sprout branches and twigs, wear leaves, and most importantly, grow roots deep enough to keep it stable and strong. However, there is one difference here. Unlike a tree, a movement can't be ripped out of its foundations or chopped off at its base. It will persist, and so will our planet.

Plotting Shapefile Data Using Geopandas, Bokeh and Streamlit in Python.

I was recently introduced to geospatial data in Python. It's represented in .shp files, in the same way any other form of data is represented in, say, .csv files. However, each line in a .shp file corresponds to either a polygon, a line, or a point. A polygon can represent certain shapes, so in the given context of maps and geospatial data, a polygon could act as a country, or perhaps an ocean and so on. Lines are used to represent boundaries, roads, railway lines etc. Points are used to represent cities, landmarks, features of interest etc. This was very much new to me, so I found it fascinating to see how any sort of map can be broken down into polygons, lines, and points, then played around with using code. I was also introduced to streamlit, which provides an alternative to Jupyter Notebooks with more interaction and, in my opinion, better visual appeal. I think one distinct advantage Jupyter Notebook has is compartmentalisation, and how good code and markdown look next to each other, but streamlit seems to be more visually appealing. One big advantage streamlit has is the fact that it is operated from the command line, making it much more efficient for a person like me who's very much comfortable with typing out commands and running stuff, rather than dragging a pointer around to click objects.

I used the geopandas library for dealing with shapefile data. It's incredibly efficient and sort of extends native pandas commands to shapefile data, making everything much easier to work with. One thing I didn't like was that streamlit doesn't have inbuilt functionality to view GeoDataFrames; essentially, that means I have to output geospatial data using the st.write() method, and that just results in some ugly, green-colored output, very much unlike the clean, tabular output you get when you use st.write() for displaying dataframes. It's also a bit surprising that st.dataframe() doesn't extend to a GeoDataFrame, but eh, it works for now.

Bokeh is new to me, so I decided to start out with making plots using inbuilt geopandas and matplotlib functionality rather than move straight to Bokeh. Hence, in this post, I'll be going through how I made and annotated some maps using geopandas, then extended that to Bokeh to make my code much more efficient. A huge advantage Bokeh brings to the table is that it can be used to store plots, so there's no going back to earlier cells to find a plot. Just output it to an html file and write some code to save the contents of that html file, and you're good to go.

The very first thing I did was create a very basic plot showing Germany filled with a blue color. Here’s a small piece of code that accomplishes that.

Plotting Germany
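In essence, it looks like this, using the low-resolution naturalearth_lowres dataset that ships with geopandas:

import geopandas as gpd
import matplotlib.pyplot as plt
import streamlit as st

# Low-resolution country polygons bundled with geopandas
world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
germany = world[world.name == 'Germany']

fig, ax = plt.subplots()
germany.plot(color='blue', ax=ax)
ax.axis('off')   # no axes: it's a map, not a chart
st.pyplot(fig)   # render the figure in the streamlit app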

This simply takes the geodata for the entire world, in low resolution. I then select the data specific to Germany, plot it using geopandas, turn the plot axes off just to make it look better visually, and output the plot to my streamlit notebook (do I call it that?). Here's the result.

As you can see, it's a very simple plot showing the nation state of Germany. Now, I thought I'd extend this, make it look better, and annotate it with the names of some cities and the capital, Berlin. The very first thing to do here is to get some data for each city, their latitudes and longitudes, to be specific. You could import that via a shapefile, which would have each location as a point, but I manually inputted 6 cities and their data from an internet source into a dataframe, and then used that to annotate my figure, which I'll be talking about now.
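Here's a sketch of that annotation code; the six cities and their (approximate) coordinates below are illustrative stand-ins for the ones I typed in:

import pandas as pd
import geopandas as gpd
import matplotlib.pyplot as plt
import streamlit as st

# Hand-typed city data; coordinates are approximate
cities = pd.DataFrame({
    'Name': ['Berlin', 'Hamburg', 'Munich', 'Cologne', 'Frankfurt', 'Stuttgart'],
    'Latitude': [52.52, 53.55, 48.14, 50.94, 50.11, 48.78],
    'Longitude': [13.40, 9.99, 11.58, 6.96, 8.68, 9.18],
})

# points_from_xy wraps Point(x, y) over the two columns
city_gdf = gpd.GeoDataFrame(
    cities, geometry=gpd.points_from_xy(cities.Longitude, cities.Latitude))

world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
germany = world[world.name == 'Germany']

fig, ax = plt.subplots()
germany.plot(color='white', edgecolor='black', ax=ax)  # white fill, black outline
ax.axis('off')

# Label every city, with a bigger red marker for the capital
for x, y, name in zip(city_gdf.geometry.x, city_gdf.geometry.y, city_gdf.Name):
    ax.annotate(name, xy=(x, y), xytext=(3, 3), textcoords='offset points')
    ax.plot(x, y, marker='o', color='red',
            markersize=8 if name == 'Berlin' else 4)

st.pyplot(fig)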

The dataframe at the top contains my city position data. Down below, I'm creating another GeoDataFrame that holds each city's data as a point object. You'll notice that I've used the points_from_xy() method while creating the city dataframe. points_from_xy() wraps around the Point() constructor; you can view it as equivalent to [Point(x, y) for x, y in zip(df.Longitude, df.Latitude)]. It works really well and removes the need for a for loop, making my code much more efficient. I've then plotted the same map as above, except with a white fill and a black outline (better looking imo). After that, I've gone over each point using a for loop and added a label, which is the city name (stored as just "Name"). I've also increased the size of the marker for Berlin, given that it's the capital. The last step is just adding a red marker to indicate each city's position. Note that st.pyplot() is a streamlit method that outputs any figures we might have. Here is the output of the code above.

I think this looks much better.

Now, I decided to plot a map showing all the main railway lines in India on top of a blank map of India, and output this to a new .html page as a bokeh plot.
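The gist of it, with pandas_bokeh doing the plotting (the railway shapefile name below is a stand-in):

import geopandas as gpd
import pandas_bokeh

# Send bokeh output to a standalone html file
pandas_bokeh.output_file('india_railways.html')

world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
india = world[world.name == 'India']

# The railway shapefile (all line objects); the file name is a stand-in
railways = gpd.read_file('railways.shp')

# Plot the blank map of India first, then overlay the railway lines
fig = india.plot_bokeh(show_figure=False)
railways.plot_bokeh(figure=fig)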

As you can see, the code for this is very simple. I've first plotted the map of India using the get_path functionality in geopandas. Then, for the sake of visibility, axis lines have been turned off. Then, I've read the railway shapefile, which consists entirely of line objects, and plotted it using the pandas_bokeh library, outputting to an html file. Here's the result.

I find working with geospatial data to be terribly useful, and very exciting. I think that's partly because I love data science, playing around with data and modelling it, and working with geospatial data opens up an entire realm of possibilities. I'd describe the experience of making my first shapefile plot as something akin to working for the first time with time series data. Much like time, it adds another dimension to what one can do. In the coming weeks, I'll be blogging more often, hopefully once every week, on geospatial data specifically. The contents of this post are not even an introduction to what can be accomplished using geospatial data. The next point of exploration, for me, is going to be how to depict elevations, and how to use color density to indicate population density, average cost of living, GDP etc. One immediate application that comes to mind is making a map that uses color to reflect how much COVID-19 has impacted a country, based on not just cumulative confirmed cases, but also factors like healthcare expenditure, economic downturn, unemployment etc. I think it'll be interesting.

You can find the entire code here.

Nihilism and Importance.

At the start, we humans thought we were the center of the universe. How naive. We thought this lie to be a universal truth that granted us special status: we were at the center of God's creation, and thus the most prized of his inventions. Then, the heliocentric model placed the Sun at the center of the universe. In what was a huge blow to human pride, we came out as self-appointed winners on the basis of still being God's cherished inventions. We then learned that the Sun wasn't the center of the universe either, but instead the center of just one of trillions of solar systems, all in constant revolution around the Milky Way's center. Then, we found out that we were just another planet around another star in another galaxy, one of trillions upon trillions. Now, there is a fair chance that our universe itself is just one of quadrillions.

While our insignificance in the cosmic scheme of things has nought but grown, we've still found ways to make ourselves feel unnecessarily special. The first of these is the proclamation that we're the only known form of intelligent life in the known universe. Now, that might hold true, but of what importance is being intelligent when, regardless of that intelligence, the Horsehead Nebula still remains untraveled, the TRAPPIST system remains unexplored, and the flat earth theory still remains believable to an absurd number of people? We're intelligent relative to the natural life on Earth (although maybe not; octopuses display far more intelligent behavior than us). On a cosmic scale, the human species is to the universe what the flutter of a butterfly's wing is to storms on Jupiter: irrelevant. In fact, the argument that being intelligent makes us special is in itself flawed; an intelligent species wouldn't knowingly harm its ecosystem and put politics above science.

So, one may ask, what does make us significant? Is it love? Friendship? Science? The former two make individual lives important, but not an entire civilization. Love isn't a metric for progress; scientific progress is, and it's measured by the Kardashev scale, which gauges a species' technological advancement. Type I denotes optimal utilization of all planetary resources. Type II denotes optimal utilization of the host star's energy output. Type III denotes optimal utilization of an entire galaxy's energy resources. We haven't even reached Type I yet, and there's no saying we ever will, given that the way we've used our planet's resources is destroying it. The only way to become significant in a cosmic sense is to expand our barriers and become a space-faring civilization. Our solar system provides a variety of environments, everything from the hellish landscape of Venus to the subsurface oceans of Europa; populating them is both an engineering challenge and a stepping stone to going beyond.

Having a measurable impact requires scientific progress, but humans would rather keep playing tic-tac-toe than switch to Minecraft. Rather than take on incredible challenges in the form of navigating the harsh environment of space and settling humans on other bodies, our politicians choose to fight and debate and argue endlessly. That's because the entire concept of modern politics is somewhat flawed. Politicians can't make decisions that balance on the scale of decades, even centuries. They cannot comprehend the importance of science, because anyone who is mildly scientific would never go into politics, and they fixate on which country did what and their endless agendas rather than the one big question that should occupy their minds: what is our place in the universe?

And it goes without saying that space offers more economic potential than anything else, so the entire notion of immense space exploration budgets drawing money away from the economy and facets like healthcare and education is founded on quicksand. Asteroid mining offers returns on the level of trillions of dollars, if not more. Lunar colonization offers dramatic reductions in the cost of space exploration and a gateway to other bodies in the Solar System. Europa offers the chance to find life beyond Earth, or at least to understand how modern Earth came to be. Space offers a dizzying array of possibilities, but we ignore them all.

As the Ancient One put it, we're a "momentary speck within an indifferent universe." Having an impact requires understanding just how important technology and science are.

But then again, do people really get that any longer? Ancient humans had access to the universe via a tool that is ironically useless today: the naked eye. Looking up offered a window into a larger world, unlike today, where the night sky is pitch black like in all of Enid Blyton's stories. That window sparked curiosity; it led to the development of calendars and timekeeping, and birthed the concept of studying, which has historically been awesome. Even astrology came into being, one of our first attempts to explain how the universe impacts us. We've lost sight of that burning curiosity now. People think they have access to a whole new world through their smartphones, but no one realises what we've lost along the way. We've lost the reason why we began being intelligent: the night sky.

Solving the UEFA Champions League problem on Code Chef with OOP

So I decided to try out Code Chef, and chose to solve the UCL problem listed on the home page of the 'practice' section. The problem statement was detailed and easy to understand; you can view it here. It basically involves inputting the match results for a league in the following format:

<home team> <home team goals> vs. <away team goals> <away team>

for instance, fcbarca 5 vs. 1 realmadrid

12 match results are inputted, and match results for T leagues are taken in. The output for each league should be the names of the winning team and the runner-up. For instance, the output for the following league schedule/results,

manutd 8 vs. 2 arsenal
lyon 1 vs. 2 manutd
fcbarca 0 vs. 0 lyon
fcbarca 5 vs. 1 arsenal
manutd 3 vs. 1 fcbarca
arsenal 6 vs. 0 lyon
arsenal 0 vs. 0 manutd
manutd 4 vs. 2 lyon
arsenal 2 vs. 2 fcbarca
lyon 0 vs. 3 fcbarca
lyon 1 vs. 0 arsenal
fcbarca 0 vs. 1 manutd

would be

manutd fcbarca

Here are some basic rules we observe for football tournaments. Each victory gives you 3 points, each loss gives you 0 points, and each draw gives you 1 point. If two teams have the same point tally, the team with the higher goal difference is placed first. For simplicity's sake, we're assuming here that no two teams with the same point total will have the same goal difference.

I defined two classes, League and Team, right at the start. The only attribute an instance of League has is table, a dictionary wherein each key is a string pertaining to a team's name and the value is a Team object containing all necessary information. I also put five methods inside the League class, which I'll be explaining in just a bit.

League Class

The does_team_exist() method just returns True or False depending on whether or not the inputted string is a key in the table dictionary. In my main code, if does_team_exist() returns False, the add_team() method is called and the self.table attribute is updated with a new Team object. If it returns True, the program continues. match_won() and match_drawn() are pretty self-explanatory. The former calls the win() method on the Team object corresponding to the team that won and the lose() method on the Team object corresponding to the team that lost; you'll notice that we also pass in the goal difference here. The latter calls the same draw() method for both teams, and no gd is passed in because it equals 0 for both teams in the event of a drawn match.

return_top_two() is our most important method. It creates a list of all values in the table dictionary. Then, I used a lambda function to sort on the basis of two parameters: points and goal difference. You'll notice that the first element of the tuple returned by the lambda is the number of points associated with that object, and the second is the goal difference for that team. After sorting all Team objects, I return the team names of the last and second-last objects.
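Reconstructed from the description above (method and attribute names follow what I've described; it leans on the Team class shown next):

class League:
    def __init__(self):
        # Maps team name (string) -> Team object
        self.table = {}

    def does_team_exist(self, name):
        return name in self.table

    def add_team(self, name):
        self.table[name] = Team(name)

    def match_won(self, winner, loser, gd):
        # gd is the goal difference of this match
        self.table[winner].win(gd)
        self.table[loser].lose(gd)

    def match_drawn(self, team_one, team_two):
        # A draw changes no goal difference, so no gd is passed
        self.table[team_one].draw()
        self.table[team_two].draw()

    def return_top_two(self):
        # Sort ascending by points first, goal difference second
        standings = sorted(self.table.values(),
                           key=lambda t: (t.get_points(), t.get_gd()))
        return standings[-1].get_name(), standings[-2].get_name()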

Now, I’ll show you the Team Class.

Team Class

This is much simpler. When initializing, the only argument is the team name, while points and goal difference are initialized to 0. The win() method increases points by 3 and goal difference by the passed-in amount, the draw() method only increments points by 1, and the lose() method decrements goal difference by gd. This is followed by three get methods, two of which are used in the lambda function shown above, and one which is used to get the desired output for our program.
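As described, the class looks roughly like this:

class Team:
    def __init__(self, name):
        # Only the name is passed in; points and goal difference start at 0
        self.name = name
        self.points = 0
        self.gd = 0

    def win(self, gd):
        self.points += 3
        self.gd += gd

    def draw(self):
        self.points += 1   # goal difference is unchanged by a draw

    def lose(self, gd):
        self.gd -= gd      # a loss gives 0 points, only dents goal difference

    def get_points(self):
        return self.points

    def get_gd(self):
        return self.gd

    def get_name(self):
        return self.name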

Now, the only part left is our main working code. Here it is.

Code

We take in T as our first input, as defined by the problem statement, then iterate over each league. Before going into the results for each league, I've initialized league to an instance of the League class. I then iterate over each of the 12 fixtures in each league. Splitting the input and extracting certain indices (see code) gives us our values for each team's name and the number of goals it scored. You can see that I have two "if not"s up there that initialize a new Team object inside the table attribute of the League class if the team in question is not there already. Then, I compare home team goals and away team goals and call the appropriate method where needed. Our output is the value returned by return_top_two().
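Putting it together, the driver code looks roughly like this:

T = int(input())

for _ in range(T):
    league = League()
    for _ in range(12):
        # e.g. "manutd 8 vs. 2 arsenal"
        parts = input().split()
        home, home_goals = parts[0], int(parts[1])
        away_goals, away = int(parts[3]), parts[4]

        # Register teams the first time they appear
        if not league.does_team_exist(home):
            league.add_team(home)
        if not league.does_team_exist(away):
            league.add_team(away)

        if home_goals > away_goals:
            league.match_won(home, away, home_goals - away_goals)
        elif away_goals > home_goals:
            league.match_won(away, home, away_goals - home_goals)
        else:
            league.match_drawn(home, away)

    winner, runner_up = league.return_top_two()
    print(winner, runner_up)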

You can check out the full code here.

Credits to Code Chef for the problem statement.

UI to mail NASA APOD image to inputted email [#2 NASA APOD Series]

I'll be going through using Flask to create a two-page website with a simple form on the first page that takes in your email; clicking the submit button sends the latest NASA APOD image to the inputted email. For this, I'm using the script written and explained in the last post in this series, which can be found here. So in this post, I'll be going through the Flask templates I created, and my form functionality, which is pretty simple.

The first step is to create a .html page and a main.py file for our website. Here's the code for each.

<body> tag of .html file

So the image above shows the code I've written for the body tag, which is all that's worth looking at. The form object passed in from the main.py file is used to just show the email field and a submit button; very simple functionality, so I'm not going to go into much detail for this part. Clicking the submit button first checks whether or not the email field has a valid input. If it doesn't, the field is cleared and a red outline is put on it. If it does, an email containing the latest NASA APOD image is mailed to the inputted address, and the user is redirected to another webpage.

This is the code in my main.py file. I've initialised my Flask app with the first line and given it a secret key (needed for forms). The send_email() function contains the code explained in the previous post, and it's called if and only if the submit button is clicked on the home page with a valid email address filled out, with the inputted email as its only argument. The code then redirects the user to the bye.html page, which is just a small html file displaying the word "bye." I did it for the sake of seeing whether or not the send_email() function had been called inside home.
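Stripped down, main.py looks something like this (route and template file names are stand-ins; emailButton is covered below, and send_email() comes from the previous post):

from flask import Flask, render_template, redirect, url_for

app = Flask(__name__)
app.config['SECRET_KEY'] = 'replace-me'   # needed by flask-wtf forms

@app.route('/', methods=['GET', 'POST'])
def home():
    form = emailButton()              # the form class shown below
    if form.validate_on_submit():     # submit clicked with a valid email
        send_email(form.email.data)   # mail the latest APOD image
        return redirect(url_for('bye'))
    return render_template('home.html', form=form)

@app.route('/bye')
def bye():
    return render_template('bye.html')   # just displays the word "bye"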

My functionality for setting up the email form is very simple. I use the wtforms module. I've created an emailButton class wherein the email field is set to a StringField object, and submit is set to a SubmitField object. The library provides validator methods: DataRequired() ensures that some data is inputted into the field, and Email() ensures that the given input is a properly formatted email address.
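The form class itself is tiny (I'm using flask_wtf's FlaskForm as the base class here, which is what validate_on_submit() above comes from):

from flask_wtf import FlaskForm
from wtforms import StringField, SubmitField
from wtforms.validators import DataRequired, Email

class emailButton(FlaskForm):
    # DataRequired: the field can't be empty; Email: the input must be
    # a properly formatted email address
    email = StringField('Email', validators=[DataRequired(), Email()])
    submit = SubmitField('Submit')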

This is what our webpage displays. After entering a valid email and clicking submit, here is what I’ve received in my inbox.

Next time I’ll have a more styled layout, and I’ll go through the countdown functionality that will send emails to inputted email addresses until the number of inputted days becomes zero. Project should ideally be complete when that’s done.

You can check out the full code here.

Using Python to email NASA APOD API data to a custom recipient [#1 NASA APOD Series]

I'm doing a mini-project as a learning experience. The first step here involved two separate functionalities that I've now combined and will explain here. Firstly, I had to write a few lines of code to actually use the API and save the image it returns to the current directory. This was really easy. I then had to learn how to use smtplib to send emails in Python. I used this article on Real Python to do so. This post on Stack Overflow was also helpful in instructing me about attaching image files to an email. You can find the entire code here, on my GitHub. I've been having some issues with WordPress plugins, so I've just attached screenshots of my code for this post.

Right, so the first thing we'll be looking at is accessing the NASA APOD API and storing the latest NASA Astronomy Picture of the Day image in the current directory.

I firstly import the necessary modules. After that, we access the API with our key. The url contains the link given in the official documentation for accessing the latest image. The API returns a json object with keys pointing to a description of the image, an HD url, and a nominal quality url. I've used the urllib module to fetch and store the image in the current working directory (whichever directory we're in, or have changed into with os). Now comes the hard part.
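The whole fetch-and-save step is only a few lines (DEMO_KEY is a placeholder for your own API key, and the output file name is my choice):

import os
import json
import urllib.request

# DEMO_KEY is a placeholder; swap in your own NASA API key
url = 'https://api.nasa.gov/planetary/apod?api_key=DEMO_KEY'

with urllib.request.urlopen(url) as response:
    data = json.loads(response.read())

# The json has keys like 'explanation', 'hdurl' and 'url'
urllib.request.urlretrieve(data['url'], os.path.join(os.getcwd(), 'apod.png'))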

You can take a look at all the modules I've imported in the GitHub link I've posted above.

Above, we start our smtp_server and initialize our port. I've used 587 because I'm using the TLS method; I think you'll need to use 465 if you use the SSL method instead. I'll be using mehulpy@gmail.com to send this email. This is a throwaway account I created with all security features turned off. I'm mailing it to mehuljangir42@gmail.com, which is one of the emails I use. We then input the password for the email we will be sending it from. Then, our message is set to an instance of the MIMEMultipart object with "alternative" as the input parameter. We then key in some values, like the email to send from, the email to mail it to, and the subject of our email. I've then specified html and plaintext messages for the email to contain.

I've attached both html and plaintext messages to our email. The procedure for attaching a .png file as an email attachment is very simple. Using the inbuilt open subroutine in Python, we read the image. Then, we create an instance of the MIMEImage object, where we key in the img_data we just read and input the path of our image file. os.path.basename returns the last component of a path, which here gives us the image's file name for the attachment header.

An SSL context is created after that. Check out this Stack Overflow post to learn more about what an SSL context is. We then move into a try/except block, so that we can catch any errors that come our way. A server is established, and server.ehlo() is basically the equivalent of waving hi to your friend across the room. We then start TLS, which is a method of encryption for safe and secure communication across the internet. Finally, using the server instance's sendmail() method, I send the email to the recipient. The finally block quits the server regardless of whether or not the email was sent and whether or not an error was thrown.
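Pulling those pieces together, the send flow looks roughly like this (the addresses are the ones mentioned above; subject and body text are placeholders):

import smtplib
import ssl
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.mime.image import MIMEImage

port = 587                      # TLS; use 465 for the SSL method instead
smtp_server = 'smtp.gmail.com'
sender = 'mehulpy@gmail.com'
receiver = 'mehuljangir42@gmail.com'
password = input('Password for sender account: ')

message = MIMEMultipart('alternative')
message['Subject'] = 'NASA Astronomy Picture of the Day'  # placeholder
message['From'] = sender
message['To'] = receiver

# Plaintext fallback plus an html body (both placeholders)
message.attach(MIMEText("Here is today's APOD image!", 'plain'))
message.attach(MIMEText("<p>Here is today's <b>APOD</b> image!</p>", 'html'))

# Attach the image saved earlier ('apod.png' matches the fetch step)
with open('apod.png', 'rb') as f:
    img = MIMEImage(f.read())
img.add_header('Content-Disposition', 'attachment', filename='apod.png')
message.attach(img)

context = ssl.create_default_context()
server = smtplib.SMTP(smtp_server, port)
try:
    server.ehlo()                      # say hi to the server
    server.starttls(context=context)   # upgrade to an encrypted connection
    server.ehlo()
    server.login(sender, password)
    server.sendmail(sender, receiver, message.as_string())
except Exception as e:
    print(e)
finally:
    server.quit()   # runs whether or not the send succeeded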

I'll now create a webpage where you can input your email, and a Python script will send the current NASA APOD image to your email. So stay hooked for #2!

Also, FYI, here’s what the email looks like:

Conducting an online MUN and Presiding over the SC.

This weekend, I had the opportunity to preside over a simulation of the Security Council and serve as Secretary General of our school's, and my city's, first ever online MUN. I made a commitment to organizing a conference way back in August, at the start of my tenure as MUN Club President. COVID-19 provided me with a unique opportunity: the chance to conduct a virtual conference unlike other MUNs, both in how effective I wanted it to be and in the fact that all committees would be working together to come up with solutions for the current pandemic. We had crisis updates in the General Assembly impacting proceedings in the Security Council, speeches by PM Modi being broadcast in the GA, and the Security Council and WHO holding a joint crisis on the basis of Iranian bioweapons allegedly supplied by China. It was truly a wonderful experience.

I had the idea of conducting such a conference over a month ago. I immediately reached out to my peers and teachers, and got the planning underway. We made a team, divided all the work, and got everything done very quickly. In fact, the hard work each team member put into the event, whether it was our logistics heads texting every single delegate, or our Vice-chairs who were tasked with working on Background guides, is commendable. Now, more about what I did.

I presided over a Security Council that was investigating COVID-19's origins and evaluating the possibility of it being a bio-weapon, and the fact that it was a simulation did not draw away from how serious the situation was. Debate started with the Russian Federation introducing a verified, reliable research paper that clearly proved COVID-19 couldn't be artificial. I then urged our committee to investigate some more pressing prospects, such as COVID-19 being a natural outbreak intentionally allowed to get out of hand, or a sort of evolved form of SARS for which a vaccine already existed (concealed by the Chinese). After a round of statement speeches, I introduced the first crisis of the day: a video of a worker in the Wuhan virology center stating that the virus was laboratory-made and intentionally released. Not long after, one of our MUN mentors who has now graduated high school joined as the guest delegate of Japan, accused China of approaching him in informal session with an offer to trade the formula for the vaccine for economic and financial support, and abruptly left; committee had just livened up. Allegations flew across the floor during formal session, and insults during informal session. China became a sort of punching bag for the rest of the committee, soaking up countless jibes and taunts and resorting to reminding the other delegates of what I had said and the source Russia provided.

I then took it a step further by releasing another video, made by a resident of Wuhan, claiming they were receiving 'magic medicine' from the authorities, and that all who refused to take it were being welded shut into their homes. This effectively meant that the Chinese had a vaccine, or had discovered an accidental cure. In either case, it meant China was concealing something important from the rest of the committee. This marked the end of the first day. However, at midnight, the General Assembly convened for a midnight crisis (a fascinating aspect of MUNing I saw happen for the first time), and witnessed PM Modi deliver a speech on how an Indian lab had created the vaccine. We then received an informal tip-off hinting at a US-India partnership, involving the US and India distributing the vaccine to their civilian populations before the rest of the world and reaping the economic benefits of selling a highly demanded product to 189 countries at a very high price. I took this opportunity to text all members of my committee with a midnight update, alerting them to what had just transpired.

Committee convened again at 8 the next morning. We debated the draft resolution at hand for the first two sessions, cut away more than 80% of it, and ultimately vetoed it, so 5 hours of effort went down the drain, not because we failed to pass a resolution, but because I failed to engage delegates in the resolution drafting process. Judging by the results of an informal vote and what I believed to be best for committee, I decided to have a series of crisis updates for the last two sessions.

The first of these was an announcement by the Indian chief of army staff, detailing how the sole laboratory with the cure for COVID-19 had been blown up via a terrorist attack, and that the delegate of India to the Security Council had been assassinated. Dealing with the fact that the only verifiable vaccine was gone, and that China had a potential vaccine, all blocs split apart as China, Russia, and the US faced a series of allegations. This was followed by an announcement by the Russian Federation detailing how they had obtained the vaccine and were prepared to sell it to all other countries at subsidized prices, but would mass produce it in the public sector and not distribute the formula so as to reap colossal economic benefit.

So, that's how committee ended. While we certainly didn't make any progress towards satisfying our agendas, we had that kind of accusation-abundant, intense debate where blocs don't really matter, and it brought out the best in our delegates. It was fascinating to see our delegates handle absurd and riveting crisis updates, and do so while coping with incredibly arcane unmods. Conducting eJPISMUN was a brilliant experience, and the way our conference turned out exceeded the aspirations I had when I came up with the entire idea.