NotesToSelf

DK  //  Factoids and occasional bits of useful information.

Nov 19 / 7:52pm

Google Wave is built for sales & trading desks (and a little on Chrome OS)

I finally got a Google Wave invitation (yaay) and have fooled around with it a bit. It's tough to really kick the tires when most of the people you would wave with don't have an account yet. The only other option is to wade into massive public waves that appear a bit chaotic. It's like when I first discovered usenet and electronic bulletin boards way back when. I had no idea what was going on and the geek factor was kicked up a notch. But it was also sort of cool. Anyway, here's former Lifehacker Gina Trapani explaining Google Wave at W2E:


Nevertheless, is it just me, or does Google Wave cry out for a trading desk application? I can see an enterprising outfit using Google's open source Wave protocol to bring trading communications into the 21st century. Between the persistent state of wave "documents" and the extensibility it offers with bots and gadgets, I could see Google Wave replacing many solutions firms currently depend on for internal and external communication. There are good structural reasons why it probably won't happen, but a little speculation doesn't hurt.

From my experience, investment banks currently use a patchwork of communication channels. Most have their own internal chat system, Bloomberg messaging/chat, email, AIM (well they used to use AIM), and the telephone. From a research perspective, notes are syndicated via email, Bloomberg, internal chat, proprietary blog-like systems, and (of course) hardcopy.

So what does Google Wave offer? From an inside-the-firm perspective, it's easy to see Wave helping traders, analysts, and salespeople collaborate around a central hub of information. That's the whole point of having a "desk" where people sit right next to each other - to improve communication. In a global enterprise, however, it can be difficult to achieve the immediacy market-making demands. Using centralized waves to manage communications would certainly reduce the number of tools in use and provide a re-playable record of the day's activity. For example, currency traders in NY could replay or review a shared global wave as they take over for London. Wave gadgets could also be created for the ever-popular polls that get sent out to clients and other traders in the bank. In-line responses would also help organize the information in a single place, rather than forcing people to switch from chat to email to Bloomberg throughout the day. I could see a salesperson subscribing to a trading wave (it may be he can make a risk-free trade by crossing with another salesperson), and maintaining a client wave (for those who choose to do so).

For firms with strong data infrastructures, I could see Wave paired with plotting and analytical extensions that could be used to share data and potential insights. Before Lehman's demise, LehmanLive was a great example of a firm moving to the web in a way that allowed the entire firm to leverage its data and analytics. For those of you who remember, imagine LehmanLive, POINT, and Google Wave all wrapped up into a single package, and you get where I'm going with this.

Many of the same benefits could be enjoyed by clients in separate sandboxed waves. And since firms can implement their own Wave system, client accounts could be created that access the firm's servers rather than Google's. And compliance will love it since waves are persistent (again, see the playback feature). Those who want to do something shady will probably stick to the phone...

Of course, it's probably a long shot any of this will happen. The Bloomberg network effect has been well-documented. Everyone uses it because everyone uses it! As such, it can crowd out patience for another system. Furthermore, the wave approach isn't immediately familiar (though I have no doubt Wall Street would adopt the technology if it thought it would make more money). One might argue that, in liquid markets, information is already traveling pretty darn fast (particularly as computers cut humans out of the loop). In less liquid, over-the-counter markets, there's actually an incentive to fight transparency since it has a direct negative impact on profitability...though the drive to gain volume and market sustainability often drives the market towards transparency in the end. Finally, for structured products, the process is so darn long and complicated, who cares? Just tell the lawyers to hurry up!

A final thought on Chrome OS. I watched the presentation today and was tickled by a pointed question from a member of the audience that essentially asked "What can I do on Chrome OS that I can't do on a regular browser?" The answer was along the lines of "uh, nothing really...but you won't get the really fast boot-up!" From an IT perspective, however, I could see Chrome OS being a godsend. Again, as an open source project, a firm could build Chrome OS into a netbook for use with a distributed workforce. If you are the aforementioned firm with a strong, web-enabled infrastructure (using Wave even!), an analyst or salesperson in the field could have instant access to most or all relevant data on the road, either using local storage or a wifi connection/vpn. Since all data is encrypted on the netbook (at least according to the keynote), it's essentially worthless (from a corporate perspective) to anyone who steals it. And netbooks are CHEAP.

Anyway, my two cents...
Filed under  //  finance   review   tech   video  

Oct 28 / 10:46pm

Use numpy to flog your.flowingdata

As noted in a previous post, your.flowingdata.com (yfd) is a handy way to collect personal data. I've been collecting sleep, diaper, etc. data on my newborn son. Although yfd now allows users to calculate durations between specified events, the visualization of the information isn't quite to my liking and it's clear that errors in the data can make for some odd durations (e.g., my son slept for two days!). Numpy to the rescue!

For those of you who don't know, numpy is python's powerful array package. Rather than loop myself to death, I thought it made more sense to use numpy's slicing and masking features to clean up the data. These features make it easy to find data entry errors.

I use the Enthought python distribution for convenience's sake (and because I can't resist all those libraries -- most of which I'll never use). Below you'll find some screenshots that step through my little script. Refer to the complete code here. (Well, it's just a start really). The code is probably a bit verbose for what it does, but we all start somewhere.

The first step is getting the data into an array you can manipulate. For your reference, your.flowingdata yields data that looks like this:


As you can see, it's basically just events and timestamps (I'm not really making full use of the data types yfd offers, as shown by all the empty fields).

The code below creates a structured array. Typically, numpy arrays are made up of items of the same type. It occurs to me that this example isn't so great because I ended up sticking with strings (S10 = a ten character string), but you get the general idea. If you imagine a 2D array, you can define one column as floats, another as strings, and yet another as int, etc. I'm mostly interested in how much the little guy is sleeping, so the 'sleep_mask' variable creates a boolean mask of all the 'gnight' and 'gmorning' events (since they are mixed in with diaper changes and other random events).
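As a rough sketch of that step (not the original script): the rows below are invented, the field names are my own labels rather than anything yfd defines, and I've used unicode strings ('U10') where the original used 'S10' so the comparisons behave on Python 3.

```python
import numpy as np

# Hypothetical rows in the shape of the yfd export:
# (action, two unused fields, UTC timestamp). All values are made up.
rows = [
    ('gmorning', '', '', '2009-10-25 11:05:12'),
    ('diaper',   '', '', '2009-10-25 02:30:41'),
    ('gnight',   '', '', '2009-10-24 23:45:36'),
    ('gmorning', '', '', '2009-10-24 10:58:02'),
]

# One dtype per column. Everything is a string here, but a column could
# just as easily be declared as a float or an int.
event_dtype = [('action', 'U10'), ('value', 'U10'),
               ('unit', 'U10'), ('timestamp', 'U19')]
events = np.array(rows, dtype=event_dtype)

# Boolean mask: True wherever the action is one of the two sleep events.
sleep_mask = (events['action'] == 'gnight') | (events['action'] == 'gmorning')
sleep_events = events[sleep_mask]
```

Indexing with the mask drops the diaper row and leaves only the sleep events, in their original order.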


We can use numpy's where() function to help us index the events we want. Now that I have an array of only gnight and gmorning events, I can offset the array against itself (since the two events should alternate) to see if there are any duplicates that might screw things up.
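The offset trick can be sketched roughly as follows. The event sequence is invented and the variable names are only illustrative; the idea is simply to compare the array with a copy of itself shifted by one position.

```python
import numpy as np

# Hypothetical sequence of sleep events. The repeated 'gmorning' at
# positions 1 and 2 is a deliberate data-entry error.
actions = np.array(['gnight', 'gmorning', 'gmorning', 'gnight', 'gmorning'])

# Compare each event with its neighbour; in clean data they always differ.
same_as_next = actions[:-1] == actions[1:]

# np.where() turns the boolean mask into index positions.
dupe_idx = np.where(same_as_next)[0]
errors = actions[dupe_idx]
```

Anything that shows up in `errors` marks the first of two identical events in a row, which is where to go looking in the raw data.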


The first time I called 'errors', numpy returned something like the following (basically telling me when/where there are dupes):

array([('gmorning', '', '', '2009-10-24 23:45:36'),('gmorning', '', '', '2009-09-30 18:15:04'), ('gnight', '', '', '2009-09-23 21:00:03'), ('gmorning', '', '', '2009-09-23 19:15:03')])

I won't step through all the code here since it's available above, but you get the idea. One thing to watch out for: datetimes. I spent a lot of time trying to figure out the best way to handle the timestamps included with the yfd event data. There are ways to convert strings to ordinal numbers, then to datetime objects, and back again, but really I wanted to manipulate the datetime objects directly to take advantage of numpy's array slicing and arithmetic. Luckily, numpy arrays can hold arbitrary python objects (technically, via the 'object' dtype). This allows you to subtract one timestamp array from another to get the elapsed time without any conversions (though you'll have to convert at some point if you want to generate a human-readable string). Here's an example of the array you'll get at the end (heads -> sleep duration, start time, end time):
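As a rough sketch of the object-dtype approach (sample timestamps invented for illustration), subtracting one array of datetime objects from another yields timedeltas directly:

```python
import datetime as dt
import numpy as np

fmt = '%Y-%m-%d %H:%M:%S'

# Invented sample timestamps, already paired up and cleaned of duplicates.
gnight = ['2009-10-23 23:30:00', '2009-10-24 23:45:36']
gmorning = ['2009-10-24 10:58:02', '2009-10-25 11:05:12']

# dtype=object keeps real datetime objects inside the arrays.
start = np.array([dt.datetime.strptime(s, fmt) for s in gnight], dtype=object)
end = np.array([dt.datetime.strptime(s, fmt) for s in gmorning], dtype=object)

# Element-wise subtraction gives an object array of timedeltas.
duration = end - start

# Convert only at the end, when a plain number is needed.
hours = [d.total_seconds() / 3600.0 for d in duration]
```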


Another unexpected pain in the butt is TIMEZONES. Although yfd's UI shows the correct local time on the web page, the tab-delimited file uses UTC (GMT) timestamps. This actually makes sense if you think about it. If you travel a lot, you'll never be sure when something happened since your timezone isn't held constant. Keeping datetime in UTC solves this problem, though you have to convert to local time yourself if necessary. Handling timezones with python's datetime library, however, sort of sucks. I recommend checking out pytz. It makes timezone management a little bit easier.
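A minimal sketch of the UTC-to-local conversion with pytz. The 'US/Eastern' zone is an assumption for illustration, so substitute your own, and the timestamp is just a sample value.

```python
import datetime as dt
import pytz

# A naive timestamp as parsed from the export; we know it is really UTC.
naive = dt.datetime.strptime('2009-10-24 23:45:36', '%Y-%m-%d %H:%M:%S')
utc_time = pytz.utc.localize(naive)

# Assumed local zone, for illustration only.
eastern = pytz.timezone('US/Eastern')
local_time = utc_time.astimezone(eastern)

local_string = local_time.strftime('%Y-%m-%d %H:%M:%S')
```

Converting with astimezone() lets pytz work out whether daylight saving time applied on that date, which is the part that is tedious to do by hand.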

Plans for the future include visualizing this data with either python or R (ggplot2 anyone?). Too bad I don't know R...
Filed under  //  life   python   tech  

Oct 13 / 8:50am

Stock Ticker Orbital Comparison = COOL

Care of Flowing Data, Stock Ticker Orbital Comparison (STOC) is one of the coolest representations of the market I've seen. Although I can't see anyone really trading on top of this visualization metaphor, it does make one think of how correlations and other parameters might be represented via animation.

STOC was built using Processing, a Java-based visualization IDE developed at MIT. I understand there are Scala and Javascript versions in development as well. The closest python equivalents I can think of are NodeBox and Mayavi. In any case, STOC has swerve. Respect.

Filed under  //  finance   tech   video  

Sep 25 / 7:17pm

Use your.flowingdata.com...for the children

Personal data capture is a meme that's gaining momentum. Products such as Nike+ and, more recently, Fitbit, target those who would like to monitor daily exercise and other activities. Websites that allow users to manually track how they use their time have also started to pop up. For those of us who like to procrastinate, these monitoring tools can help by providing regular feedback. Watching a little line move in the right direction can be pretty motivating.

Of course, I don't use any of these services. For myself.

Nevertheless, as a new father, I've found that your.flowingdata.com is an easy and useful way to track the activities of my newborn son! The service uses tweets to capture pretty much any kind of data you'd care to record. There are electronic products (e.g., Itsbeen, basically a stopwatch on steroids) that help new parents keep track of when the baby last slept, ate, poo'ed, etc. They do not, however, capture that data for analysis. My wife and I would like to see the historical data to see if we can tease out some insights about our son (e.g., how much sleep does he need before he gets cranky?). We tried using an iPhone app called Blogger that helps parents keep track of these things, but it wasn't immediate enough. We ended-up writing down events on the nursery mirror with a dry erase pen, but I really wanted to track things via a single button press. By the time I've finished dodging multiple salvos of pee and poo, multiple diaper changes due to said peeing and pooing, spit-up, puking, and sundry other lovely activities (a testament to how much I love you, boy), I can't remember anything that's happened in the last five minutes, let alone the last hour or two. So far, your.flowingdata.com has been the answer.

your.flowingdata.com ('yfd') is a service based on Twitter. Users send direct messages to 'yfd' and can visit the site for simple visualizations. Users can also download tab-delimited files with all the data. But wait, there's more! One kind soul also created a simple yfd iPhone application that allows users to send an update (e.g. 'd yfd gnight') via a single button press. Each button can be customized as well. I have no use for Twitter, but yfd got me to open an account. We're still figuring out what we want to record, but the service's flexibility and ease of use make it much more likely we'll actually use it.

yfd isn't perfect. There's no built-in way to, for example, calculate the time that has elapsed between two actions (e.g. going to sleep and waking up). One has to download the data and calculate durations manually (or create a script to do it). There are other visualizations available, though. As I mentioned, I find it's much more important to make it easy to capture data for something like this. If it's a pain to capture the data, there won't be anything to analyze on the back-end anyway.
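For what it's worth, such a script doesn't need to be long. The sketch below uses only the standard library and assumes a tab-delimited layout of action, two unused fields, and a timestamp, with rows in chronological order; the sample data and the sleep_durations() helper are invented for illustration.

```python
import csv
import io
from datetime import datetime

# Invented sample in the assumed layout: action, two unused fields, timestamp.
sample = (
    "gnight\t\t\t2009-10-23 23:30:00\n"
    "diaper\t\t\t2009-10-24 03:10:00\n"
    "gmorning\t\t\t2009-10-24 10:58:02\n"
)

def sleep_durations(handle):
    """Pair each 'gnight' with the next 'gmorning' and return the timedeltas."""
    durations, asleep_since = [], None
    for row in csv.reader(handle, delimiter='\t'):
        action = row[0]
        stamp = datetime.strptime(row[-1], '%Y-%m-%d %H:%M:%S')
        if action == 'gnight':
            asleep_since = stamp
        elif action == 'gmorning' and asleep_since is not None:
            durations.append(stamp - asleep_since)
            asleep_since = None
    return durations

durations = sleep_durations(io.StringIO(sample))
```

In real use, `io.StringIO(sample)` would be replaced by an open file handle on the downloaded export.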

So, if you have absolutely no interest in personal fitness, time tracking, etc., you may want to check out your.flowingdata.com...for the children.

UPDATE: yfd has been updated to allow the calculation of durations between defined actions. I'd love to be able to aggregate these durations over a given time period (i.e. daily, weekly, monthly, etc.) in the form of a bar chart or something. yfd does visualize the data, but in a slightly different way. Best if you just check it out through the "Explore" link on the yfd site.

Filed under  //  life   tech  

Sep 11 / 9:03am

FriendFeed and Python

I had no idea FriendFeed was driven by python!

 http://www.tornadoweb.org/

Filed under  //  python   tech  

Aug 14 / 9:54pm

Sqlite and SqlAlchemy

Although I'm beginning to think that it may make more sense to use something like PyTables to store time series data, it's hard to escape the ubiquity of relational databases in the enterprise. In tightly controlled corporate environments, PyTables might not even be an option. Since I'm on a database kick, I thought I might as well investigate ORMs (object relational mappers) and see whether they make sense (from an analyst perspective). SQLAlchemy (SQLA) is one of the big kahunas in the python community, though there are clearly many others (Django ORM, SqlObject, etc.).

I've come to realize SQLAlchemy doesn't promise that you'll write less code. It just promises that the additional code you write (when necessary) will be worth the additional power and flexibility it provides. SQLAlchemy allows the user to leverage the powerful idioms of the python language, provides a consistent "API" for multiple databases, and automates many database housekeeping details (e.g. transactions, joins, etc.). It also offers powerful reflection features that make accessing legacy databases simple. Furthermore, SQLAlchemy features an SQL expression language separate from the ORM so users can choose between SQL-like syntax or objects when appropriate, allowing the user to map tables to classes at will. Other ORMs bind tables and classes together tightly (a la the ActiveRecord pattern featured in Rails and other ORMs).

The documentation for SQLAlchemy is mostly good. It's good because it exists, it's maintained, and it documents the complete API. The tutorials are instructive, but I felt they were a bit hard to follow since the author attempts to highlight different ways to do the same thing. This conflation of demo and tutorial makes it harder to keep track of what exactly needs to be instantiated and when. A separate interactive demo screencast + a more linear tutorial might have worked better.

Anyway, I won't cover SQLAlchemy's Expression Language here. It's available in the documentation and should make sense to those already familiar with SQL. The expression language essentially transforms SQL into method calls (e.g. "table.insert().values(values)" rather than "INSERT INTO table (fields) VALUES (values)").
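Just to give a flavor of it, here is a minimal sketch against an in-memory SQLite database. The cut-down 'assets' table is hypothetical, and the connection handling assumes a reasonably recent SQLAlchemy release.

```python
from sqlalchemy import Column, Integer, MetaData, String, Table, create_engine

# In-memory SQLite database so the sketch leaves nothing behind.
engine = create_engine('sqlite:///:memory:')
metadata = MetaData()

# A cut-down, hypothetical 'assets' table.
assets = Table('assets', metadata,
               Column('asset_id', Integer, primary_key=True),
               Column('ticker', String, unique=True),
               Column('tag', String))
metadata.create_all(engine)

# Method calls instead of a hand-written INSERT statement.
stmt = assets.insert().values(ticker='GOOG', tag='stock')
sql_text = str(stmt)  # the SQL string SQLAlchemy generates for the statement

# engine.begin() opens a transaction and commits it when the block exits.
with engine.begin() as conn:
    conn.execute(stmt)
    rows = [tuple(r) for r in conn.execute(assets.select())]
```

Printing `sql_text` is a handy way to check what a statement will look like before it goes anywhere near the database.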

So, using matplotlib's handy quotes_historical_yahoo() function, we can replicate the database from a previous post using SQLAlchemy's declarative plugin. The declarative plugin allows the user to map tables to objects in a single step. The following code defines two tables, "assets" and "prices," and defines a function for pulling data for a given ticker from yahoo (adapted from the previous post on sqlite and python).

'''
SQLAlchemy ORM declarative example.
'''

from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import Table, Column, Integer, String, DECIMAL
from sqlalchemy import MetaData, create_engine, ForeignKey
from sqlalchemy.orm import relation, backref, sessionmaker, scoped_session
from matplotlib.finance import quotes_historical_yahoo
import datetime
import os

path = os.path.expanduser('~')  + \
     '/Dev/Data/AssetPrices/SQLite/assetprices_SA.sqlite'
engine = create_engine('sqlite:////' + path, echo=True)
Base = declarative_base(bind=engine)

date1 = datetime.datetime(2009,1,1)
date2 = datetime.datetime.now()

#DEFINE TABLES#

class Asset(Base):
    __tablename__ = 'assets'
    
    asset_id = Column(Integer, primary_key=True)
    ticker = Column(String, unique=True)
    tag = Column(String)
    
    prices = relation('Price', order_by='Price.gregorian_day', backref='assets')
    
    def __init__(self, ticker, tag):
        self.ticker = ticker
        self.tag = tag
        
    def __repr__(self):
        return "<Asset('%s', '%s')>" % (self.ticker, self.tag)
    
class Price(Base):
    __tablename__ = 'prices'
    
    price_id = Column(Integer, primary_key=True)
    asset_id = Column(Integer, ForeignKey('assets.asset_id'))
    gregorian_day = Column(Integer)
    date_string = Column(String)
    year = Column(Integer)
    month = Column(Integer)
    day = Column(Integer)
    px_open = Column(DECIMAL)
    px_close = Column(DECIMAL)
    px_high = Column(DECIMAL)
    px_low = Column(DECIMAL)
    volume = Column(Integer)
    
    #asset = relation(Asset, backref=backref('prices',
                                            #order_by=gregorian_day))

    def __init__(self, gregorian_day, date_string, year, month, day,
                 px_open, px_close, px_high, px_low, volume):
        self.gregorian_day = gregorian_day
        self.date_string = date_string
        self.year = year
        self.month = month
        self.day = day
        self.px_open = px_open
        self.px_close = px_close
        self.px_high = px_high
        self.px_low = px_low
        self.volume = volume
        
    def __repr__(self):
        return "<Price('%s', '%s', '%s','%s','%s','%s','%s','%s','%s','%s')>" \
               % (self.gregorian_day, self.date_string, self.year, self.month,
                  self.day, self.px_open, self.px_close, self.px_high, 
                  self.px_low, self.volume)

#CREATE DB TABLES#
 
Base.metadata.create_all(engine)

#PACKAGE RAW DATA#

def package_data(db=None, ticker=None, tag='stock', start=None, end=None):
    '''
    package_data() uses quotes_historical_yahoo() to create a data set for a 
    given stock's price history. Date_string, year, month, and day fields are 
    included for added flexibility. Returns a dictionary with two keys: 
    'data' (a list of tuples) and 'headers' (a tuple of field names).
    '''
    raw_quotes = quotes_historical_yahoo(ticker, start, end) #list of tuples
    
    data = []
    for quote in raw_quotes:
        date_raw = datetime.datetime.fromordinal(int(quote[0]))
        year, month, day = date_raw.year, date_raw.month, date_raw.day
        date_string = date_raw.strftime("%Y-%m-%d")
        record = (ticker, tag, quote[0], date_string, year, month, day,
                  quote[1], quote[2], quote[3], quote[4], quote[5])
        data.append(record)    
    
    headers = ('ticker',
               'tag',
               'gregorian_day', 
               'date_string', 
               'year', 
               'month', 
               'day', 
               'px_open', 
               'px_close', 
               'px_high', 
               'px_low', 
               'volume')
    return {'data':data, 'headers':headers}
 

Executing this code essentially sets up the schema for an sqlite database and provides a package_data() function that will pull in data for a given ticker and date range. The "echo=True" parameter in  "engine = create_engine('sqlite:////' + path, echo=True)" will print out the SQL statements SQLA generates to the terminal.

Executing the code yields:

>>>
Evaluating SAscratchcode.py
2009-08-14 00:58:57,862 INFO sqlalchemy.engine.base.Engine.0x...b9b0 PRAGMA table_info("assets")
2009-08-14 00:58:57,863 INFO sqlalchemy.engine.base.Engine.0x...b9b0 ()
2009-08-14 00:58:57,863 INFO sqlalchemy.engine.base.Engine.0x...b9b0 PRAGMA table_info("prices")
2009-08-14 00:58:57,863 INFO sqlalchemy.engine.base.Engine.0x...b9b0 ()
2009-08-14 00:58:57,864 INFO sqlalchemy.engine.base.Engine.0x...b9b0
CREATE TABLE assets (
    asset_id INTEGER NOT NULL,
    ticker VARCHAR,
    tag VARCHAR,
    PRIMARY KEY (asset_id),
     UNIQUE (ticker)
)

2009-08-14 00:58:57,864 INFO sqlalchemy.engine.base.Engine.0x...b9b0 ()
2009-08-14 00:58:57,865 INFO sqlalchemy.engine.base.Engine.0x...b9b0 COMMIT
2009-08-14 00:58:57,866 INFO sqlalchemy.engine.base.Engine.0x...b9b0
CREATE TABLE prices (
    price_id INTEGER NOT NULL,
    asset_id INTEGER,
    gregorian_day INTEGER,
    date_string VARCHAR,
    year INTEGER,
    month INTEGER,
    day INTEGER,
    px_open NUMERIC(10, 2),
    px_close NUMERIC(10, 2),
    px_high NUMERIC(10, 2),
    px_low NUMERIC(10, 2),
    volume INTEGER,
    PRIMARY KEY (price_id),
     FOREIGN KEY(asset_id) REFERENCES assets (asset_id)
)

2009-08-14 00:58:57,866 INFO sqlalchemy.engine.base.Engine.0x...b9b0 ()
2009-08-14 00:58:57,868 INFO sqlalchemy.engine.base.Engine.0x...b9b0 COMMIT


This output shouldn't be too surprising. We've basically just created the tables we defined. So let's experiment interactively and create an asset object for Google.

>>> GOOG=Asset('GOOG', 'stock')
>>> GOOG
<Asset('GOOG', 'stock')>
>>> GOOG.ticker
'GOOG'

As you can see, it's possible now to call attributes of the GOOG object by name (e.g. ticker).

In our table definitions, we used SQLA's relation() function (e.g. prices = relation('Price', order_by='Price.gregorian_day', backref='assets')) to define a one-to-many relationship between an asset and its prices. SQLA uses the foreign key defined in the prices table to automatically generate the correct SQL. This is particularly interesting for sqlite users as sqlite parses foreign key statements but does not enforce them. Using this relation function, we can actually use dot notation to look at the GOOG object's prices attribute, just as if the prices were part of the object.

>>> GOOG.prices
[]

The prices are represented by an empty list since we haven't actually written any prices into the database yet. So let's insert some prices.

>>> raw = package_data(ticker='GOOG', start=date1, end=date2)
>>> raw['headers']
('ticker', 'tag', 'gregorian_day', 'date_string', 'year', 'month', 'day', 'px_open', 'px_close', 'px_high', 'px_low', 'volume')

The package_data() function returns a python dictionary, {'data': [list of tuples], 'headers': (tuple of headers)}. So how do we assign the prices to the right ticker? The obvious way to do it would be to use a list comprehension to create a list of Price objects, and assign them to the GOOG object's "prices" attribute.

>>> GOOG.prices = [Price(record[2],record[3],record[4], record[5],record[6],record[7],record[8],record[9],record[10],record[11]) for record in raw['data']]
>>> GOOG.prices
[<Price('733409.0', '2009-01-02', '2009','1','2','308.6','321.32','321.82','305.5','3610500')>,
<Price('733412.0', '2009-01-05', '2009','1','5','321.0','328.05','331.24','315.0','4889000')>,...]

I've just listed the first two records to save some space, but you get the picture. Now, the important thing to recognize here is that no SQL has been issued to the database yet. In order to reduce the back and forth with the database, SQLA uses a Session() object to keep track of what's going on. So let's set up a session and add our GOOG object to the session so SQLA can track it.

>>> Session = scoped_session(sessionmaker(engine))
>>> session = Session()
>>> session.add(GOOG)
>>> session.commit()
2009-08-14 01:10:15,424 INFO sqlalchemy.engine.base.Engine.0x...b9b0 BEGIN
2009-08-14 01:10:15,425 INFO sqlalchemy.engine.base.Engine.0x...b9b0 INSERT INTO assets (ticker, tag) VALUES (?, ?)
2009-08-14 01:10:15,425 INFO sqlalchemy.engine.base.Engine.0x...b9b0 ['GOOG', 'stock']
2009-08-14 01:10:15,471 INFO sqlalchemy.engine.base.Engine.0x...b9b0 INSERT INTO prices (asset_id, gregorian_day, date_string, year, month, day, px_open, px_close, px_high, px_low, volume) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
2009-08-14 01:10:15,471 INFO sqlalchemy.engine.base.Engine.0x...b9b0 [1, 733409.0, '2009-01-02', 2009, 1, 2, '308.6', '321.32', '321.82', '305.5', 3610500]
2009-08-14 01:10:15,472 INFO sqlalchemy.engine.base.Engine.0x...b9b0 INSERT INTO prices (asset_id, gregorian_day, date_string, year, month, day, px_open, px_close, px_high, px_low, volume) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
2009-08-14 01:10:15,472 INFO sqlalchemy.engine.base.Engine.0x...b9b0 [1, 733412.0, '2009-01-05', 2009, 1, 5, '321.0', '328.05', '331.24', '315.0', 4889000]
...(and it continues on)...

So the code above basically creates a session, adds our GOOG object to the session, and then commits all changes. The commit() method signals SQLA to issue all the necessary SQL in a single transaction to our sqlite database. Now that there are actual prices in the database, we can check out GOOG.prices:

>>> GOOG.prices[0]
2009-08-14 01:18:57,881 INFO sqlalchemy.engine.base.Engine.0x...b9b0 BEGIN
2009-08-14 01:18:57,883 INFO sqlalchemy.engine.base.Engine.0x...b9b0 SELECT assets.asset_id AS assets_asset_id, assets.ticker AS assets_ticker, assets.tag AS assets_tag
FROM assets
WHERE assets.asset_id = ?
2009-08-14 01:18:57,883 INFO sqlalchemy.engine.base.Engine.0x...b9b0 [1]
2009-08-14 01:18:57,885 INFO sqlalchemy.engine.base.Engine.0x...b9b0 SELECT prices.price_id AS prices_price_id, prices.asset_id AS prices_asset_id, prices.gregorian_day AS prices_gregorian_day, prices.date_string AS prices_date_string, prices.year AS prices_year, prices.month AS prices_month, prices.day AS prices_day, prices.px_open AS prices_px_open, prices.px_close AS prices_px_close, prices.px_high AS prices_px_high, prices.px_low AS prices_px_low, prices.volume AS prices_volume
FROM prices
WHERE ? = prices.asset_id ORDER BY prices.gregorian_day
2009-08-14 01:18:57,885 INFO sqlalchemy.engine.base.Engine.0x...b9b0 [1]
<Price('733409', '2009-01-02', '2009','1','2','308.6','321.32','321.82','305.5','3610500')>

Using normal python slicing syntax, we've just called up the first record in the prices table for Google. In this case, SQLA uses "lazy loading" to pull the appropriate Price object by issuing the SQL on demand. Users can choose to 'eager load' the data as well. Now that the corresponding Price object has been pulled we can inspect other attributes.
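For comparison, eager loading might look roughly like the sketch below. It is a stripped-down stand-in for the models above rather than the same code: the tables are cut down, the prices are invented, and it assumes a newer SQLAlchemy (1.4 or later), where relationship() replaces relation() and declarative_base lives in sqlalchemy.orm.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import (declarative_base, joinedload, relationship,
                            sessionmaker)

Base = declarative_base()

class Asset(Base):
    __tablename__ = 'assets'
    asset_id = Column(Integer, primary_key=True)
    ticker = Column(String, unique=True)
    prices = relationship('Price', order_by='Price.gregorian_day')

class Price(Base):
    __tablename__ = 'prices'
    price_id = Column(Integer, primary_key=True)
    asset_id = Column(Integer, ForeignKey('assets.asset_id'))
    gregorian_day = Column(Integer)

engine = create_engine('sqlite:///:memory:')
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

# Two made-up price rows, deliberately added out of date order.
session.add(Asset(ticker='GOOG',
                  prices=[Price(gregorian_day=733412),
                          Price(gregorian_day=733409)]))
session.commit()

# joinedload() fetches the asset and its prices in one JOINed SELECT,
# instead of waiting for the first access to .prices.
goog = (session.query(Asset)
        .options(joinedload(Asset.prices))
        .filter_by(ticker='GOOG')
        .one())
days = [p.gregorian_day for p in goog.prices]
```

Which style is better depends on the access pattern: lazy loading avoids pulling rows that are never used, while eager loading saves a round trip when the prices are always needed.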

>>> GOOG.prices[0].px_high
Decimal("321.82")
>>> test_run = [(record.date_string, record.px_close) for record in GOOG.prices[0:10]]
>>> test_run
[(u'2009-01-02', Decimal("321.32")), (u'2009-01-05', Decimal("328.05")), (u'2009-01-06', Decimal("334.06")), (u'2009-01-07', Decimal("322.01")), (u'2009-01-08', Decimal("325.19")), (u'2009-01-09', Decimal("315.07")), (u'2009-01-12', Decimal("312.69")), (u'2009-01-13', Decimal("314.32")), (u'2009-01-14', Decimal("300.97")), (u'2009-01-15', Decimal("298.99"))]

In the example above, we call up the high price for the first record in our table. 'test_run' simply creates a list of tuples, using the date_string and px_close fields.

Anyway, there's a lot more to SQLA, and this just scratches the surface. We'll see how deep the rabbit hole goes!

Filed under  //  finance   python   tech  

Aug 14 / 10:16am

Apple Airport troubles

For those of you who may be having mysterious AirPort troubles (e.g. your laptop can't see, let alone connect to, your AirPort Extreme), you might try downloading this combo update from Apple. It doesn't show up when you search the support site. It's a pretty hefty download, but it seemed to work for me.
Filed under  //  tech  

Aug 8 / 7:55am

Python and Powerpoint

I recently talked to someone who was interested in integrating the different MS-Office products programmatically. The obvious solution is VBA, since it's built-in. I have no desire to learn VBA, but Python does offer the win32 COM interface. I'd almost forgotten since I've been using a Mac for a while. Anyway, I ran across this short tutorial on using COM and Python to automate the creation of PowerPoint slides. I used COM with Excel a while back, but it was slow (so I turned to the very nice xlwt/xlrd combo to manipulate Excel files instead). Nevertheless, I can see it coming in handy if you are constantly updating slides with essentially the same, but more recent, data.

UPDATE: I recently learned of reStructuredText, which many python tools use to create documentation from plain text files. There are tools such as S5, Bruce, and rst2pdf that facilitate the creation or display of presentations in different ways.

Filed under  //  python   tech  

Aug 6 / 5:32pm

Sqlite or Pytables or Text?

I'm wondering whether it makes more sense to store time series data in sqlite or a hierarchical database (like PyTables, which is based on the HDF5 format). Or maybe even straight-up text files!

Sqlite is nice because it runs everywhere and can connect to almost anything. Could serve as the 'Rosetta Stone' for slinging data around.

But PyTables is nice because it integrates with multidimensional numpy arrays and offers object-like convenience, meaning inter-row analysis is probably easier. PyTables is probably faster than sqlite, but that's not really a big concern for me. Both hold plenty of data.

Text files, like CSV, are dead-simple and immediately accessible, but would require more logical work.
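To make the trade-off a little more concrete, here is a small standard-library sketch that stores the same invented closing prices both ways and pulls out the days that closed above 325. The table layout and numbers are made up for illustration, and PyTables is left out to keep the example dependency-free.

```python
import csv
import io
import sqlite3

# Invented closing prices for illustration.
prices = [('2009-01-02', 321.32), ('2009-01-05', 328.05), ('2009-01-06', 334.06)]

# Option 1: sqlite. The filtering logic lives in SQL.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE prices (date_string TEXT, px_close REAL)')
conn.executemany('INSERT INTO prices VALUES (?, ?)', prices)
from_sqlite = conn.execute(
    'SELECT date_string, px_close FROM prices '
    'WHERE px_close > ? ORDER BY date_string', (325,)).fetchall()

# Option 2: CSV text. Trivial to write, but parsing, type conversion,
# and filtering are all up to you.
buffer = io.StringIO()
csv.writer(buffer).writerows(prices)
buffer.seek(0)
from_csv = [(d, float(p)) for d, p in csv.reader(buffer) if float(p) > 325]
```

Both routes give the same answer here; the difference is where the "logical work" ends up.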

Decisions, decisions...

Filed under  //  python   tech  

Aug 4 / 8:42pm

Use python and sqlite3 to build a database CODE

Click here to download:
YahooSqlite.py (4 KB)

Of course, I forgot to attach the code to the previous post.

Filed under  //  finance   python   tech  
