NotesToSelf

DK  //  Factoids and occasional bits of useful information.

Nov 21 / 12:46pm

HBO Rebroadcast: Pacman vs. Cotto at 10pm

If you missed the Cotto vs. Pacman fight on PPV last week, it's worth watching the rebroadcast tonight on HBO. I did not expect Manny to dismantle Cotto so completely. Unbelievable.

Comments (0)

Nov 19 / 7:52pm

Google Wave is built for sales & trading desks (and a little on Chrome OS)

I finally got a Google Wave invitation (yaay) and have fooled around with it a bit. It's tough to really kick the tires when most of the people you would wave with don't have an account yet. The only other option is to wade into massive public waves that appear a bit chaotic. It's like when I first discovered Usenet and electronic bulletin boards way back when. I had no idea what was going on and the geek factor was kicked up a notch. But it was also sort of cool. Anyway, here's former Lifehacker editor Gina Trapani explaining Google Wave at W2E:


Nevertheless, is it just me, or does Google Wave cry out for a trading desk application? I can see an enterprising outfit using Google's open source Wave protocol to bring trading communications into the 21st century. Between the persistent state of wave "documents" and the extensibility it offers with bots and gadgets, I could see Google Wave replacing many solutions firms currently depend on for internal and external communication. There are good structural reasons why it probably won't happen, but a little speculation doesn't hurt.

From my experience, investment banks currently use a patchwork of communication channels. Most have their own internal chat system, Bloomberg messaging/chat, email, AIM (well they used to use AIM), and the telephone. From a research perspective, notes are syndicated via email, Bloomberg, internal chat, proprietary blog-like systems, and (of course) hardcopy.

So what does Google Wave offer? From an inside-the-firm perspective, it's easy to see Wave helping traders, analysts, and salespeople collaborate around a central hub of information. That's the whole point of having a "desk" where people sit right next to each other: to improve communication. In a global enterprise, however, it can be difficult to achieve the immediacy market-making demands. Using centralized waves to manage communications would certainly reduce the number of tools in use and provide a re-playable record of the day's activity. For example, currency traders in NY could replay or review a shared global wave as they take over for London, etc. Wave gadgets could also be created for the ever-popular polls that get sent out to clients and other traders in the bank. In-line responses would also help organize the information in a single place rather than switching from chat to email to Bloomberg, etc. throughout the day. I could see a salesperson subscribing to a trading wave (it may be he can make a risk-free trade by crossing with another salesperson), and maintaining a client wave (for those clients who choose to do so).

For firms with strong data infrastructures, I could see Wave paired with plotting and analytical extensions that could be used to share data and potential insights. Before Lehman's demise, LehmanLive was a great example of a firm moving to the web in a way that allowed the entire firm to leverage its data and analytics. For those of you who remember, imagine LehmanLive, POINT, and Google Wave all wrapped up into a single package, and you get where I'm going with this.

Many of the same benefits could be enjoyed by clients in separate sandboxed waves. And since firms can implement their own Wave system, client accounts could be created that access the firm's servers rather than Google's. And compliance will love it since waves are persistent (again, see the playback feature). Those who want to do something shady will probably stick to the phone...

Of course, it's probably a long shot any of this will happen. The Bloomberg network effect has been well-documented. Everyone uses it because everyone uses it! As such, it can crowd out patience for another system. Furthermore, the wave approach isn't immediately familiar (though I have no doubt Wall Street would adopt the technology if it thought it would make more money). One might argue that, in liquid markets, information is already traveling pretty darn fast (particularly as computers cut humans out of the loop). In less liquid, over-the-counter markets, there's actually an incentive to fight transparency since it has a direct negative impact on profitability...though the drive to gain volume and market sustainability often drives the market towards transparency in the end. Finally, for structured products, the process is so darn long and complicated, who cares? Just tell the lawyers to hurry up!

A final thought on Chrome OS. I watched the presentation today and was tickled by a pointed question from a member of the audience that essentially asked "What can I do on Chrome OS that I can't do on a regular browser?" The answer was along the lines of "uh, nothing really...but you won't get the really fast boot-up!" From an IT perspective, however, I could see Chrome OS being a godsend. Again, as an open source project, a firm could build Chrome OS into a netbook for use with a distributed workforce. If you are the aforementioned firm with a strong, web-enabled infrastructure (using Wave even!), an analyst or salesperson in the field could have instant access to most or all relevant data on the road, either using local storage or a wifi connection/vpn. Since all data is encrypted on the netbook (at least according to the keynote), it's essentially worthless (from a corporate perspective) to anyone who steals it. And netbooks are CHEAP.

Anyway, my two cents...
Filed under  //  finance   review   tech   video  

Comments (0)

Nov 17 / 8:28am

Trefis decomposes stock price

via TechCrunch:

Started by three engineers and math whizzes from MIT and Cornell (Manish Jhunjhunwala, Adam Donovan, and Cem Ozkaynak) who did time at McKinsey and UBS bank, Trefis breaks down a stock price by the contribution of a company’s major products and businesses. For instance, 51.3 percent of Apple’s stock price is attributed to the iPhone, 25.5 percent to the Macintosh, and only 7.7 percent to iTunes and iPhone apps. Don’t agree? You can change the underlying assumptions by simply dragging lines on charts forecasting the future price of the iPhone, its market share going out to 2016, and so forth. Every time you change an assumption, the price target changes accordingly.

So let's take a company we all love to hate, AT&T. The screenshot above shows how Trefis decomposes the company's stock price. You can click through to get a more in-depth breakdown of AT&T's business. There's also a social component to the service where subscribers can contribute their own customized models.

There aren't that many companies to choose from, but Trefis is still in the free period. I imagine users will have to pay for full access in the future. In any case, it seems like a neat toy.

Filed under  //  finance  

Comments (0)

Nov 2 / 6:40pm

ggplot2, plyr, and your.flowingdata

The previous post described how I went about cleaning up some yfd data using Python and numpy. I have no doubt it can be done in fewer lines of code, but I think the post described how useful it can be to manipulate arrays rather than looping through everything. With the data cleaned up, I hoped to visualize my newborn son's sleep schedule. I recently received an example that does the same thing as my python code, but in 3 lines! It uses R, ggplot2, and plyr. A few more lines can generate pretty plots like this (box plot of sleep length in hrs vs. start time):


As the plot above shows, my son doesn't sleep a helluva lot during the day. The boxplot also illustrates how volatile his night sleeping has been. This tells me I need to do a better job of getting the boy to nap during the day in hopes of producing longer and more restful sleep periods at night.

While Python has been my gateway drug into the world of programming, I've been itching to try out a plotting package based on R, ggplot2. R is a popular language in the statistics community that has enjoyed some good press recently. Anyway, my little sleep duration project seemed perfect for some R exploration.

After searching around on the Interweb, I managed to write some broken R code that didn't really do what I wanted. Luckily, Hadley Wickham (the author of plyr and ggplot2) took pity on me and offered up some example code to point me in the right direction. I was shocked at the efficiency of the example, particularly given all the wrangling I had to do in python. Now, just for the record, I'm not making any statements about R vs. Python. Hadley obviously created plyr and ggplot2 to make R easier to use, and I imagine the same could be (or already has been) done for python. I just lack the experience and education to know!

Anyway, plyr and ggplot2 are very nice libraries that offer yet more reasons to learn R. Thank you Professor Wickham! Between python and R, I've got to believe one can slice and dice almost anything. If I could only get rpy2 working...
Filed under  //  life   python   R  

Comments (0)

Oct 28 / 10:46pm

Use numpy to flog your.flowingdata

As noted in a previous post, your.flowingdata.com (yfd) is a handy way to collect personal data. I've been collecting sleep, diaper, etc. data on my newborn son. Although yfd now allows users to calculate durations between specified events, the visualization of the information isn't quite to my liking and it's clear that errors in the data can make for some odd durations (e.g., my son slept for two days!). Numpy to the rescue!

For those of you who don't know, numpy is python's powerful array package. Rather than loop myself to death, I thought it made more sense to make use of numpy's slicing and masking features to clean up the data. These features make it easy to find data entry errors.

I use the Enthought python distribution for convenience's sake (and because I can't resist all those libraries -- most of which I'll never use). Below you'll find some screenshots that step through my little script. Refer to the complete code here. (Well, it's just a start really). The code is probably a bit verbose for what it does, but we all start somewhere.

The first step is getting the data into an array you can manipulate. For your reference, your.flowingdata yields data that looks like this:


As you can see, it's basically just events and timestamps (I'm not really making full use of the data types yfd offers, as shown by all the empty fields).

The code below creates a structured array. Typically, numpy arrays are made up of items of the same type. It occurs to me that this example isn't so great because I ended up sticking with strings (S10 = a ten character string), but you get the general idea. If you imagine a 2D array, you can define one column as floats, another as strings, and yet another as int, etc. I'm mostly interested in how much the little guy is sleeping, so the 'sleep_mask' variable creates a boolean mask of all the 'gnight' and 'gmorning' events (since they are mixed in with diaper changes and other random events).
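Since the screenshot of this step may not be visible, here is a minimal sketch of the idea rather than the original script. The sample rows, the field names (event, value, unit, time), and the string widths are assumptions made for illustration; it also uses unicode strings (U10) where the original used S10, because S10 holds bytes on Python 3.

```python
import numpy as np

# Hypothetical rows in the general shape of the yfd export:
# (event, value, unit, timestamp). Real exports may differ.
rows = [
    ("gnight", "", "", "2009-10-23 21:00:03"),
    ("diaper", "", "", "2009-10-24 02:10:00"),
    ("gmorning", "", "", "2009-10-24 06:30:12"),
]

# A structured dtype gives each "column" its own name and type.
# These field names are assumptions, not yfd's own.
dtype = [("event", "U10"), ("value", "U10"), ("unit", "U10"), ("time", "U19")]
events = np.array(rows, dtype=dtype)

# Boolean mask that is True only for the sleep-related events.
sleep_mask = (events["event"] == "gnight") | (events["event"] == "gmorning")

# Indexing with the mask drops the diaper changes and other events.
sleep_events = events[sleep_mask]
print(sleep_events["event"])
```

Indexing by field name (events["event"]) returns a plain array for that column, which is what makes the one-line mask possible.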


We can use numpy's where() function to help us index the events we want. Now that I have an array of only gnight and gmorning events, I can offset the two (since they alternate) to see if there are any duplicates that might screw things up.


The first time I called 'errors', numpy returned something like the following (basically telling me when/where there are dupes):

array([('gmorning', '', '', '2009-10-24 23:45:36'),
       ('gmorning', '', '', '2009-09-30 18:15:04'),
       ('gnight', '', '', '2009-09-23 21:00:03'),
       ('gmorning', '', '', '2009-09-23 19:15:03')])
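The offset trick described above can be sketched roughly as follows. This is an illustration under assumptions, not the original script: the labels array is made-up data, and the variable names other than 'errors' are hypothetical.

```python
import numpy as np

# Hypothetical sequence of sleep events in time order. gnight and
# gmorning should alternate, so two in a row signals a data entry error.
labels = np.array(
    ["gnight", "gmorning", "gmorning", "gnight", "gnight", "gmorning"]
)

# Offset the array against itself: compare each event with its predecessor.
same_as_previous = labels[1:] == labels[:-1]

# np.where returns the positions where the comparison is True; add 1
# because position i in the comparison refers to labels[i + 1].
error_idx = np.where(same_as_previous)[0] + 1

# The duplicated events themselves.
errors = labels[error_idx]
print(error_idx, errors)
```

The same comparison works on the event field of a structured array, so the flagged indices can be used to pull out the full rows, timestamps included.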

I won't step through all the code here since it's available above, but you get the idea. One thing to watch out for: datetimes. I spent a lot of time trying to figure out the best way to handle the timestamps included with the yfd event data. There are ways to convert strings to ordinal numbers, then to datetime objects, and back again, but really I wanted to manipulate the datetime objects directly to take advantage of numpy's array slicing and arithmetic. Luckily, numpy arrays can hold arbitrary python objects (technically, the array's 'dtype' is set to object). This allows you to subtract one timestamp array from another to get the elapsed time without any conversions (though you'll have to convert at some point if you want to generate a human-readable string). Here's an example of the array you'll get at the end (headers: sleep duration, start time, end time):
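For readers who can't see the screenshot, the subtraction of one timestamp array from another might look something like the sketch below. The timestamps are invented and the variable names are hypothetical; only the technique (object arrays of datetime values) comes from the post.

```python
from datetime import datetime

import numpy as np

FMT = "%Y-%m-%d %H:%M:%S"

# Hypothetical, already-cleaned gnight / gmorning timestamps.
night_strings = ["2009-10-23 21:00:03", "2009-10-24 20:45:10"]
morning_strings = ["2009-10-24 06:30:12", "2009-10-25 05:15:40"]

# dtype=object lets the arrays hold real datetime objects.
starts = np.array([datetime.strptime(s, FMT) for s in night_strings], dtype=object)
ends = np.array([datetime.strptime(s, FMT) for s in morning_strings], dtype=object)

# Element-wise subtraction yields an object array of timedelta values.
durations = ends - starts

# Convert only at the end, when a plain number of hours is needed.
hours = np.array([d.total_seconds() / 3600.0 for d in durations])

for d, s, e in zip(durations, starts, ends):
    print(d, s, e)
```

Keeping the values as datetime objects until the last step means slicing and masking still work on them like any other array.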


Another unexpected pain in the butt is TIMEZONES. Although yfd's UI shows the correct local time on the web page, the tab-delimited file uses UTC (GMT) timestamps. This actually makes sense if you think about it. If you travel a lot, you'll never be sure when something happened since your timezone isn't held constant. Keeping datetime in UTC solves this problem, though you have to convert to local time yourself if necessary. Handling timezones with python's datetime library, however, sort of sucks. I recommend checking out pytz. It makes timezone management a little bit easier.
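A minimal sketch of the UTC-to-local conversion with pytz, assuming the export's timestamps are naive UTC strings. The target zone (America/New_York) is an assumption chosen for illustration, and pytz has to be installed separately.

```python
from datetime import datetime

import pytz  # third-party package

# A timestamp as it appears in the tab-delimited file (UTC, no zone info).
utc_naive = datetime.strptime("2009-10-24 23:45:36", "%Y-%m-%d %H:%M:%S")

# Attach the UTC zone so the datetime becomes timezone-aware.
utc_aware = pytz.utc.localize(utc_naive)

# Assumed local zone; swap in whatever zone applies to you.
local_tz = pytz.timezone("America/New_York")

# astimezone applies the correct offset, daylight saving time included.
local_time = utc_aware.astimezone(local_tz)
print(local_time.strftime("%Y-%m-%d %H:%M:%S %Z"))
```

Doing the conversion only when displaying or plotting keeps all of the duration arithmetic in UTC, where daylight saving changes can't distort it.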

Plans for the future include visualizing this data with either python or R (ggplot2 anyone?). Too bad I don't know R...
Filed under  //  life   python   tech  

Comments (1)

Oct 13 / 8:50am

Stock Ticker Orbital Comparison = COOL

Care of Flowing Data, Stock Ticker Orbital Comparison (STOC) is one of the coolest representations of the market I've seen. Although I can't see anyone really trading on top of this visualization metaphor, it does make one think of how correlations and other parameters might be represented via animation.

STOC was built using Processing, a Java-based visualization IDE developed at MIT. I understand there are Scala and Javascript versions in development as well. The closest python equivalents I can think of are NodeBox and Mayavi. In any case, STOC has swerve. Respect.

Filed under  //  finance   tech   video  

Comments (0)

Oct 12 / 11:38am

Import AntiGravity

Just saw this...

Filed under  //  life   python  

Comments (0)

Oct 10 / 11:42am

Baby T-Pain

I wish it sounded like that...

Filed under  //  life  

Comments (0)

Oct 7 / 1:24pm

Freeset Helps Free the Indentured in India

Some friends of mine are hosting a talk by Kerry Hilton, the founder of Freeset. From the website:

Freeset exists specifically to provide freedom for women from the sex trade, women who were forced into prostitution by trafficking or poverty. These women didn't choose their profession — it was chosen for them.

Now, they're being offered a real choice. When they choose to work at Freeset, they can start new lives, regain dignity in their communities, and begin a journey towards healing and wholeness.

All profits from Freeset in Kolkata benefit the women (salary, health insurance and retirement plan) and are used to grow the business. This means more women can be employed and experience freedom.

The great thing is, when you buy a Freeset product, you directly participate in a woman's journey to freedom.


Freeset trains these women to make custom bags and tee shirts. I'm not sure how differentiated the bags are from other bags, but the story is pretty unique.

The talk starts at 2:30pm this Sunday at the Reformed Church of the Tarrytowns (42 N Broadway, Tarrytown, NY). Stop by if you want to learn more.

Filed under  //  life  

Comments (0)

Oct 1 / 4:55am

Palantir Finance looks promising

Garry (one of Posterous' founders) highlights the latest offering from Palantir: Palantir Finance. It looks like it has pretty powerful charting tools. I've signed up for an account and will report back once I've fiddled with it. I'm excited to try out the data exploration capabilities of this new tool (and, of course, to find out whether there's an API).

Filed under  //  finance  

Comments (0)