How to hack a free web hosting service with restricted access to external hosts and set up a free mailer service

Yes, you read that correctly: there are two frees in the title of this post. I am lucky to have a free web hosting service provided by freehostia. Its basic chocolate pack is completely free for life and has enough capacity to host my personal blog and website. But free things in life come with their own limitations, and this hosting service is no exception. It doesn’t allow any connections to external hosts, which means no contact forms on your website and no email forwarding to any of your personal email ids. The only option was to use the mail service provided by the hosting provider, which I was not willing to do, as it would add yet another address to the bunch of email ids that I only occasionally use. After searching for a solution for a few days, I had my light bulb moment. Why should it matter that the server doesn’t allow outgoing connections to external hosts, when it still allows incoming ones? I can write the details submitted through the contact form to a webpage and read that page from outside using a script.
So this is the setup that I used. First, I wrote the data coming from the website’s contact form to a webpage, along with a timestamp. Then I hosted a Python script that runs every 10 minutes, scrapes that webpage, and mails me any contact information added to the page in the last 10 minutes. I used Heroku’s app hosting service to host the script and its scheduler add-on to run it every 10 minutes. And now I have a free mailer service for life. The data pipeline is: contact form, then a webpage on the free host, then the scheduled Heroku script, then an email to me.

Following is the code for the Heroku scraping and email script.

import smtplib
import requests, pytz
from datetime import datetime, timedelta
from bs4 import BeautifulSoup

class Gmail(object):
    def __init__(self, email, password):
        self.email = email
        self.password = password
        self.server = 'smtp.gmail.com'
        self.port = 587
        session = smtplib.SMTP(self.server, self.port)
        session.ehlo()
        session.starttls()
        session.login(self.email, self.password)
        self.session = session

    def send_message(self, subject, body):
        ''' This must be removed '''
        headers = [
            "From: " + self.email,
            "Subject: " + subject,
            "To: " + self.email,
            "MIME-Version: 1.0",
            "Content-Type: text/html"]
        headers = "\r\n".join(headers)
        self.session.sendmail(
            self.email,
            '',
            headers + "\r\n\r\n" + body)

class ReadPage(object):
    def __init__(self):
        message = ""
        self.url = ''
        page = requests.get(self.url, headers={'User-Agent': 'Chrome'})
        soup = BeautifulSoup(page.text, 'html.parser')
        data = soup.find_all('p')
        now = datetime.now(pytz.timezone("America/Chicago")).replace(second=0, microsecond=0)
        print("now: ", now)
        for i in range(5, len(data), 6):
            time = " ".join(data[i].get_text().split(" ")[1:4])
            time_format = '%m-%d-%Y %I:%M %p'
            time = datetime.strptime(time, time_format)
            time = (pytz.timezone("America/Chicago")).localize(time, is_dst=None)
            print("time: ", time)
            if now - time <= timedelta(seconds = 600):
                message = message + str(data[i-5])
        if message != "":
            gm = Gmail('', '') # you will have to enable logging in through less secure apps in google settings
            gm.send_message("Someone contacted you!!", message)

ReadPage()
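
The heart of the script is the check that decides whether an entry is recent enough to be mailed. A minimal sketch of that check as a standalone function follows, so it can be tried without a network connection or a Gmail account. The function name is_recent, the sample timestamps and the fixed UTC-6 offset are assumptions made for illustration; the script above uses pytz with America/Chicago instead. Unlike the script, this sketch also rejects timestamps that lie in the future.

```python
from datetime import datetime, timedelta, timezone

TIME_FORMAT = "%m-%d-%Y %I:%M %p"  # same format string as in the script above


def is_recent(stamp, now, window=timedelta(minutes=10), tz=timezone.utc):
    """Return True if `stamp` lies within `window` before `now`.

    `stamp` is a string such as "03-14-2020 09:15 PM" (hypothetical sample),
    interpreted in the timezone `tz`. `now` must be timezone-aware.
    """
    when = datetime.strptime(stamp, TIME_FORMAT).replace(tzinfo=tz)
    age = now - when
    return timedelta(0) <= age <= window


# Fixed-offset stand-in for US Central time (an assumption; no DST handling).
central = timezone(timedelta(hours=-6))
now = datetime(2020, 3, 14, 21, 20, tzinfo=central)

print(is_recent("03-14-2020 09:15 PM", now, tz=central))  # 5 minutes old -> True
print(is_recent("03-14-2020 08:55 PM", now, tz=central))  # 25 minutes old -> False
```

Because the scheduler is not guaranteed to fire exactly every 600 seconds, an entry close to the edge of the window could in principle be mailed twice or missed; remembering the timestamp of the last mailed entry would be one way to avoid that.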


The form is live and working at www.tinkerer.in

How F.R.I.E.N.D.S taught me why to choose my distribution wisely

Like most 90’s kids, I am a crazy fan of the American television sitcom Friends, maybe a bit crazier than others, though I will leave that for the readers to decide. And like most die-hard fans, I have the habit of playing random episodes of Friends while doing other chores. Even though I have watched all 234 episodes a gazillion times, I still enjoy the humour that each character brings in their own style. But deciding which episode to play is tedious when you have 234 episodes to pick from. So I thought, wouldn’t it be cool if I let my computer choose a random episode for me?

Aha! What an opportunity to put my skills to good use. The task was pretty simple: choose a random episode from a random season and play it in VLC media player. So I did some research, got help from a friend on how to choose a random file from a random folder on the command line, and implemented that solution as a small bash script using the sort command

#!/bin/bash
dir="../Friends"
season=`ls ${dir} | sort -R | head -1`
ep=`ls "${dir}/${season}/"*.mkv | sort -R | head -1`
/Applications/VLC.app/Contents/MacOS/VLC "$ep"

and set its alias to friends so that I could play a random episode just by typing friends on the command line.
I was happy that a random Friends episode was just a couple of keystrokes away, and I started enjoying the episodes served up by this self-written script. But my happiness didn’t last long. I noticed that the more episodes I played, the more they kept repeating. Some repetition, I thought, was bound to happen: under the hood, the sort command in effect throws a 234-faced die, one face per episode, and you will sometimes get the same number more than once. But that didn’t explain everything. Some episodes were repeating far too often, and some were not coming up at all.
So to understand what’s happening here, let’s recall some high school probability. If you throw an unbiased die, every number from 1 to 6 has the same probability of 1/6, i.e. on average each number comes up once in every 6 throws. That is a uniform distribution. With a biased die, however, this may not be the case. This made me suspicious about how my script was working.
To investigate, I ran the script a million times, recorded and visualised the output, and found something unexpected. For the data nerds out there, this process of running an experiment many times is called a Monte Carlo simulation.

Apparently the 234-faced die used by the script’s sort command was not unbiased, just as I had suspected. The recorded picks had a non-uniform distribution across episodes, which explains why the script ended up playing some episodes more often than others.
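
A Monte Carlo check of this kind can be sketched in a few lines of Python without launching VLC at all. The sketch below is not the experiment from this post. It compares two ways of picking an episode: choosing a season first and then an episode inside it, as the bash script does, and choosing directly from one flat list of all episodes. The season sizes are invented for illustration (they only sum to 234), and the function names are hypothetical. It illustrates one possible source of skew, namely that episodes in shorter seasons come up more often under the two-stage pick. Whether that is what happened with the bash script is an assumption, not something measured here.

```python
import random
from collections import Counter

# Hypothetical season sizes, invented for illustration (they sum to 234).
SEASON_SIZES = [24, 24, 25, 24, 24, 25, 24, 24, 23, 17]
EPISODES = [(s, e) for s, size in enumerate(SEASON_SIZES) for e in range(size)]


def pick_two_stage(rng):
    """Uniform season first, then a uniform episode within that season."""
    season = rng.randrange(len(SEASON_SIZES))
    return season, rng.randrange(SEASON_SIZES[season])


def pick_flat(rng):
    """Uniform pick over the flat list of all episodes."""
    return rng.choice(EPISODES)


def simulate(picker, trials, seed=0):
    """Run `picker` `trials` times and count how often each episode comes up."""
    rng = random.Random(seed)
    return Counter(picker(rng) for _ in range(trials))


def mean_plays(counts, season):
    """Average number of plays per episode within one season."""
    total = sum(n for (s, _), n in counts.items() if s == season)
    return total / SEASON_SIZES[season]


TRIALS = 200_000
for name, picker in (("two-stage", pick_two_stage), ("flat", pick_flat)):
    counts = simulate(picker, TRIALS)
    # Compare a 25-episode season (index 2) with the 17-episode one (index 9).
    print(name, round(mean_plays(counts, 2)), round(mean_plays(counts, 9)))
```

With the two-stage pick, each episode of the 17-episode season is expected to be played about 25/17 times as often as each episode of a 25-episode season, while the flat pick keeps the per-episode averages close to each other.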

So I had to throw away the whole script and think of an alternative solution, starting from the underlying distribution. I didn’t know how to generate a uniform distribution from the command line, and that’s where Python came to the rescue. I wrote another script, this time in Python, to draw a number from a uniform distribution, which is like rolling an unbiased 234-faced die.

#!/usr/bin/env python
import pandas as pd
import numpy as np
import subprocess
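# frnds.csv lists the episodes; the value in column 1 is what gets passed to VLC
data = pd.read_csv('../Friends/frnds.csv')
# uniform float in [0, 234), truncated to an integer row index 0..233
idx = int(np.random.uniform(0, 234))
episode = data.iloc[idx, 1]
# launch VLC and wait for it to exit
proc = subprocess.Popen(["/Applications/VLC.app/Contents/MacOS/VLC", episode], stdout=subprocess.PIPE)
proc.communicate()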

Tadaaaa!!! This worked like a charm and gave a distribution that looked pretty uniform over a million trials. Thanks to the uniform distribution, life was sweet again, and that’s how I learnt why it’s important to choose your distribution wisely.
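
The same uniform pick can also be made without pandas, NumPy or the CSV file, by listing the episode files directly and choosing one with the standard library. This is a minimal sketch that swaps in random.choice for the np.random.uniform call used above. It assumes the layout implied by the bash script, one folder per season containing .mkv files; the function names are hypothetical.

```python
import random
from pathlib import Path


def list_episodes(root):
    """Return every .mkv file one level below `root` (root/<season>/<file>.mkv), sorted."""
    return sorted(Path(root).glob("*/*.mkv"))


def pick_episode(root, rng=random):
    """Pick one episode uniformly at random across all seasons."""
    episodes = list_episodes(root)
    if not episodes:
        raise FileNotFoundError(f"no .mkv files found under {root}")
    return rng.choice(episodes)
```

Calling pick_episode('../Friends') returns a path that can be handed to VLC in the same way as the episode variable in the script above. Because the choice is made over one flat list, every file has the same chance of being picked regardless of how many episodes its season contains.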