
Oct. 7th, 2008

How (not) to get to the airport in Calgary

You're in Calgary at the (quite fun) HotNets 2008 workshop. Your flight leaves at 10:45pm, promising to be a brain-draining three-flight red-eye back home. You want some exercise before the flight, but you've checked out of your hotel already. Combine that with a habit of enjoying going places under your own power, and the solution seems obvious:

Walk to the airport.

It turns out that, in Calgary, it's possible. You might even take this route. The first 6 miles are great: through Nose Hill Park on trails through knee-high grass, with a great view of the city and lots of friendly people walking happy puppies. After that, though...

Advice 1: Don't try to cut through the under-development extension to the airport blvd. Climbing through a (stationary) train is a bit nerve-wracking. Getting stopped by a river afterwards just sucks, because you know that you then have to climb through the train again. And before you ask, the river was too wide to jump, and it smelled like cow poop. I met the cows a bit later when I went around the area to the north. They looked very surprised to see a human walking past on the freeway, as about fifteen pairs of cow eyes swiveled to intently track my progress.

Advice 2: Just ... stop at the 6 mile spot and call a taxi. The rest of the route gets very freeway-like. The overpass over Highway 2 is particularly noteworthy, with its knee-high guardrail and rushing traffic inspiring mild vertigo even in a climber.

Advice 3: Ignore the above. You'll be chuckling about the time you walked to the airport in Canada for years to come...

A few other photos.

(Many thanks to Carey Williamson for suggesting a much better route than the one I'd initially picked!)

Sep. 23rd, 2008

No power, no crepes


The trailing end of Hurricane Ike knocked out power for 85,000 people. As the old saying goes - no power, no crepes. (September 15th)

Sep. 21st, 2008

Four espressos not to miss in Seattle

I was in Seattle a while ago for Sigcomm 2008. While there, I decided to go on a personal tour of Seattle coffee shops. What I found were four excellent places to grab an espresso. Most serve double ristretto shots, rich, with a deep red crema and a surprising acidity. Forgive the lack of detail - they're all very good and I'd go to any of them again. You can't get espresso like this in Pittsburgh, alas, though I'm perfecting my technique on the machine at the Intel lab. :)

  • Stumptown Coffee Roasters
  • Vivace
  • Victrola Coffee Roasters - not only very good espresso, but they offer cuppings on Wednesdays, which was quite fun. Nice selection of coffees to try.
  • Stickman Coffee - Yelpers complain about the service, but the espresso was the thickest I've had and quite tasty. My biggest complaint was that they allow smoking in the attached courtyard, which would otherwise be a great place to hang out with a laptop for an afternoon.

Sep. 2nd, 2008

Don LaFontaine

That Voiceover... [Chicago Tribune]. Trailers may never sound so good again.

Aug. 28th, 2008

I've discovered Twitter

http://twitter.com/dave_andersen.
Do you tweet? Post/link!

Aug. 27th, 2008

Networking geek humor

Brighten Godfrey gave a great outrageous opinions talk at sigcomm this year. [Warning: May only be funny to networking geeks.]

The naive reader will conclude that conjunctions went out of fashion in the mid 80's, and came back after the dot com crash. However, the inappropriately perspicacious reader realizes that this conclusion is subtly flawed, because the word "that" might be a pronoun, adjective, or adverb: not just a conjunction.

Aug. 25th, 2008

Hood to Coast Results

Our team did quite well overall at Hood to Coast. My legs:

leg 1:  46:19  6.89 miles (6:43)
leg 2:  34:50  5.00 miles (6:58)
leg 3:  57:02  7.79 miles (7:19)

overall:  2:18:11  19.68 miles  (7:01).
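
For anyone who wants to check the arithmetic, here is a small sketch that recomputes the per-mile paces from the times above and the leg distances listed in the Aug. 12th post (6.89, 5.00, and 7.79 miles). The helper names `to_seconds` and `pace` are made up for this example, and it assumes paces are truncated to whole seconds rather than rounded.

```ruby
# Convert "mm:ss" or "h:mm:ss" into total seconds.
def to_seconds(clock)
  clock.split(":").map(&:to_i).reduce(0) { |total, part| total * 60 + part }
end

# Pace per mile as "m:ss", truncating fractional seconds (an assumption).
def pace(total_seconds, miles)
  secs = (total_seconds / miles.to_f).floor
  format("%d:%02d", secs / 60, secs % 60)
end

# Times from the results above; distances from the Aug. 12th post.
legs = [["46:19", 6.89], ["34:50", 5.00], ["57:02", 7.79]]

legs.each do |clock, miles|
  puts "#{clock}  #{miles} miles  (#{pace(to_seconds(clock), miles)})"
end

total_seconds = legs.sum { |clock, _| to_seconds(clock) }
total_miles   = legs.sum { |_, miles| miles }
puts "overall: #{total_seconds} s over #{total_miles.round(2)} miles (#{pace(total_seconds, total_miles)})"
```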

*just* missed my goal of doing sub-7 for the average. Ahh well, happy anyway. I blame it on one too many brownies at sigcomm. (Sigcomm update later.)

Team results:

18. 	Team Wildfire 	Pittsburgh, PA   Team #716
        25:07:24 	 7:39 	 18 / 283

Our time of 25:07:24 put us at 120th overall and 18th in the male mixed open. I'm not sure why we're listed there instead of male open, though, since we should have lost our "mixed"ness. We'd have been #26 / 214 in plain male open - not a huge difference.

For comparison, the winning team ran it in 16:58 (5:10 chip pace).

As hoped, however, we did beat the North American Distance Sprinters (the acronym was accompanied by matching van decorations...), at 36 / 283 (26:21:39, 8:01 pace), but we got whipped by a very fast women's team we played tag with for a while, the Pink Ladies (24:08:21, 7:21 pace). Very nice run overall. Now to proceed with two days of walking funny...

Aug. 12th, 2008

Running goodies

It's been a good running month. I decided to train a little bit for the upcoming Hood to Coast relay (two weeks from now). Each team has 12 members; each person runs 3 legs of about 6 miles. I'll be running legs 9, 21, and 33 (6.89 mi, 5 mi, 7.79 mi), for a total of 19.68 miles. Sadly, that's the longest set of legs and I'm not the fastest guy on our team, but I was greedy - it had the least punishing downhill section and the most time spent running on trail instead of road.

So, I did a quick ramp-up back to 14 milers on weekends and have been running about 37 miles per week for the last three weeks. Taking it easier this week before my knees start twitching, but it's felt surprisingly good - my last week was roughly three 8 milers and one 14 miler, plus a bit of cycling and swimming. I did sleep through cycling on Thursday, though - lack of sleep and lots of running finally caught up. I'm a fair bit over what I'd like to weigh for the race, but at least the distance feels good.

On the bright side of all of this, the BBC reports that running can slow the effects of ageing:

Running not only appeared to slow the rate of heart and artery related deaths, but was also associated with fewer early deaths from cancer, neurological disease, infections and other causes.

And there was no evidence that runners were more likely to suffer osteoarthritis or need total knee replacements than non-runners - something scientists have feared.

Bring on the miles!

Jul. 28th, 2008

FAWN rack improved

My students found a shop at CMU where they could slice and dice the new FAWN rack. The results are awesome (and much more stable than the previous one):

Jul. 16th, 2008

Home-brewed cluster rack

Our FAWN (Fast Array of Wimpy Nodes) cluster is coming together. We've purchased 20 alix3c2 embedded boards, and have them netbooting and racked in our new home-brewed cluster rack:



The alix3c2s are larger than we think the nodes would be if we built them to purpose, but they're a great way to prototype the cluster. We've also found that you can power two alixes per power supply, and operating the power supplies at a higher load makes them run more efficiently, saving about half a watt per node.

(The goal of FAWN is to build a highly power-efficient cluster for doing key-value lookups, of the sort that Amazon and Facebook do when serving catalog requests -- fetch a picture of a book, an entry in someone's Facebook page, etc.)

Jul. 11th, 2008

All vacation photos online

Erm - we took a lot of photos. Now online, courtesy of newly purchased space at Google:

And a new look for the blog, along with more prominent links to the RSS and Atom feeds.

Jul. 10th, 2008

Arvind Krishnamurthy talk: Incentives for P2P Systems

I hosted Arvind Krishnamurthy for an SDI seminar talk today. Some notes from his very fun talk, which is based on two of his recent research projects, BitTyrant and One-Hop Reputations for P2P [pdf link].

  • 20% of users use P2P systems. Enormously popular; 50-85% of Internet traffic.
  • Youtube: 1,000 TB of data per day. BitTorrent: 10,000 TB of data per day.
  • Prior P2P systems had very poor incentives: 1% of Gnutella users satisfied 50% of queries, and most Gnutella users didn't share anything.

Enter BitTorrent...

  • Tit-for-tat: Send data to the top k people who send you data. Split your upload capacity equally between these k peers.
  • Each round, pick one other peer to send data to "optimistically"
  • Trust is pairwise, which is robust: Peers trust only what they see directly.
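
To make the mechanism concrete, here is a minimal sketch of one tit-for-tat round as described in the bullets above. It is not BitTorrent's actual choking code: the function name `tit_for_tat_round`, the `rates` hash, and the choice to split capacity evenly across the k reciprocated peers plus the one optimistic peer are all assumptions made for illustration.

```ruby
# rates: hash of peer id => rate at which that peer recently sent us data.
# Returns a hash of peer id => upload rate we give that peer this round.
def tit_for_tat_round(rates, upload_capacity, k, rng: Random.new)
  # Reciprocate with the k peers that sent us the most data.
  top = rates.sort_by { |_, rate| -rate }.first(k).map(&:first)

  # Pick one other peer "optimistically", regardless of what it sent us.
  others = rates.keys - top
  optimistic = others.empty? ? [] : [others.sample(random: rng)]

  unchoked = top + optimistic
  return {} if unchoked.empty?

  # Assumption: capacity is split evenly across everyone we upload to.
  share = upload_capacity / unchoked.size.to_f
  unchoked.map { |peer| [peer, share] }.to_h
end

observed = { "a" => 50, "b" => 10, "c" => 30, "d" => 5 }
p tit_for_tat_round(observed, 100, 2)
```

Note that each peer decides using only the rates it observed itself, which is the pairwise-trust property from the last bullet.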

Observation: This system results in considerable amounts of altruistic uploads by fast peers. If a peer has a lot of upload capacity, it splits all of its capacity among the top-k peers. If those peers are relatively slow, it may upload much more than it's able to download.

  • Low-capacity peers are altruistic: They're not fast enough to convince other people to upload to them, so they mostly make altruistic uploads.
  • Many high-capacity peers are matched with sets of lower-capacity peers, and so make altruistic uploads.

BitTyrant:

  1. Determine the exact (minimum) rate that you have to upload to a peer in order to convince it to upload back to you.
  2. Figure out return on investment: How much they upload to you vs. how much you have to upload to them.
  3. Upload to the peers that have the best return on investment.

Result: 25% of the downloads go at least twice as fast.

  • Also uses less upload bandwidth to achieve same or faster download rate.
  • If everyone uses BitTyrant: Overall performance is better, too.
  • But: If BitTyrant stops uploading at point of diminishing returns (doesn't upload more than it must), system-wide throughput is lower. This behavior is closer to block-based tit-for-tat (you give me one chunk, I give you one). BitTorrent by default is like progressive taxation: High-capacity people upload disproportionately more.
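
The three BitTyrant steps can be sketched roughly as below. This is only an illustration of the return-on-investment idea, not the real client: it assumes someone has already estimated, for each peer, a `:cost` (the minimum upload rate needed to get reciprocation) and a `:benefit` (the download rate that peer gives back), and the name `bittyrant_choose` and the greedy budget loop are my own simplifications.

```ruby
# peers: array of hashes like { id: "a", cost: 10, benefit: 50 }, where
#   cost    = estimated minimum upload rate that earns reciprocation
#   benefit = estimated download rate the peer gives in return
# Returns the ids of the peers to upload to, best return on investment first.
def bittyrant_choose(peers, upload_capacity)
  # Rank by return on investment: what we get back per unit uploaded.
  ranked = peers.sort_by { |peer| -(peer[:benefit].to_f / peer[:cost]) }

  chosen = []
  remaining = upload_capacity
  ranked.each do |peer|
    next if peer[:cost] > remaining # cannot afford this peer's minimum rate
    chosen << peer[:id]
    remaining -= peer[:cost]        # upload only the minimum, nothing extra
  end
  chosen
end

candidates = [
  { id: "a", cost: 10, benefit: 50 },
  { id: "b", cost: 40, benefit: 80 },
  { id: "c", cost: 5,  benefit: 5 },
  { id: "d", cost: 30, benefit: 90 }
]
p bittyrant_choose(candidates, 50)
```

Contrast this with plain tit-for-tat, where a fast peer splits all of its capacity among its top k peers whether or not they can return the favor.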

Overall BitTorrent system performance is poor:

  • 74% of swarms provide less than 50KB/sec.
  • 100-fold increase in upload contribution provides only a 2.7x improvement.
  • Problems: Lack of incentive to upload lots of data; lack of incentive to stick around after you finish downloading.

One-hop reputations can get most incentive bang for buck, in a way that is still scalable and decentralized:

  • Most popular 2,000 peers interact with 97% of all peers.
  • Use these popular peers as trust intermediates. Each peer remembers how much excess other peers sent to it. If A and B both send data to C, then each accrues a "credit" at C. Later, A can "spend" that credit at C in order to download from B, or vice versa.
  • Robust to cheating intermediaries as long as some are honest. (handwavy, pointed to details in paper.)
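
Here is a toy sketch of the credit bookkeeping at a single intermediary C, based only on my notes above. The function names `record_upload` and `spend_credit`, the plain-hash ledger, and the rule that both parties must already be known to C are assumptions for illustration; the paper's actual protocol (including how it handles cheating intermediaries) is more involved.

```ruby
# ledger: intermediary C's memory, peer id => excess data that peer sent to C.
def record_upload(ledger, peer, amount)
  ledger[peer] = ledger.fetch(peer, 0) + amount
end

# `requester` asks to spend credit held at C in order to download `amount`
# from `provider`. Assumption: both must have a history with C, and the
# requester's credit is debited on success.
def spend_credit(ledger, requester, provider, amount)
  return false unless ledger.key?(requester) && ledger.key?(provider)
  return false if ledger[requester] < amount

  ledger[requester] -= amount
  true
end

ledger_at_c = {}
record_upload(ledger_at_c, "A", 100) # A sent data to C
record_upload(ledger_at_c, "B", 40)  # B sent data to C
puts spend_credit(ledger_at_c, "A", "B", 60) # A spends credit to download from B
p ledger_at_c
```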

Jul. 5th, 2008

Change of scenery


(Lava hitting the ocean at Volcanoes National Park)


Evening settling in on the Na Pali Coast, Kaua'i.

Jun. 27th, 2008

Rae Lakes loop 2008 - some initial photos

+ + = :)

Jun. 22nd, 2008

ACM awards ceremony; off on vacation

I just got back from the ACM awards ceremony. I was there because Bindu was one of the finalists for the student research competition for her work on Similarity-Enhanced Transfer. Awesomely, she received first place. Needless to say, I'm happy as a clam for her, and very glad that Michael and I had the chance to advise her on this research.

The awards ceremony was lots of things--very cool, a bit too long, and a nice reminder of how cool computer science can be. In that regard, it was really energizing; I left thinking "damn! gotta work harder!" A few that really jumped out:

  • Daphne Koller received a cool $150,000 with the Infosys Foundation award for her work on machine learning techniques.
  • Vern Paxson received the Grace Murray Hopper Award for his work in the late '90s that set the standards for Internet measurement; a lot of my own subsequent Internet measurement work was directly inspired by Vern's techniques and insistence on taking a careful and scientific approach to network measurement.
  • Randy Wang was recognized for starting the Digital Study Hall Project at MSR India, which is trying to revolutionize teaching materials in the developing world. Interestingly, it's been a great year for MSR India -- the gossip vine has it that Bill Thies, one of the hot job market candidates this year, turned down both Berkeley and Stanford to go work there, presumably to focus on his interests in creating technology for the developing world. (Note: This really is gossip; I haven't heard it from any party involved, so don't quote me!)

Finally, it was almost an embarrassment of riches for Carnegie Mellon at the ceremony this year, which I've got to admit was really cool, if perhaps mildly awe-inspiring. It makes you remember that those really nice, unassuming folks I bump into every day in the hallway and in faculty meetings are also, if you'll forgive the phrasing, bomb-ass, famous researchers. Ed Clarke, of course, received the Turing Award; Randy Pausch received the outstanding educator award; Bindu's first-place finish was mentioned above; two of the three dissertation award honorable mentions went to CMU people; and Avrim Blum, Randy Pausch, and Donald E. Thomas were all inducted as fellows.

And on that note, I leave for two weeks in the wilderness. A few days hiking the Rae Lakes Loop in the Sierra, and a week in Hawaii (hiking the Na Pali Coast on Kauai, then flying to the Big Island for adventures TBD).

May. 3rd, 2008

The joy of Ruby's Hpricot

Hpricot is a frighteningly useful HTML parser. USENIX requires HTML versions of papers at their conferences, so we're creating versions of Dan's Perspectives and Bindu's Dsync papers. Amar and Vijay started an automated framework based on the Hevea LaTeX-to-HTML translator a while ago, which we've now improved further.

(For some reason, automated TeX-to-HTML translators mostly suck, particularly if you want clean, XHTML-strict output. Hevea appears to be the best of the batch, but its output is still pretty awful.)

Enter Hpricot. Our cleanup.rb script uses it to automatically de-gunk most of the Hevea output. It's awesome for manipulating most any HTML, though. You can write things like:

require 'hpricot'

doc = Hpricot(open("index.html"))
# Nuke credits and comments
doc.search("//meta[@name='GENERATOR']").remove
doc.search("//comment()").remove

and more complicated things, like removing extraneous "font" tags from the bibliography list:

doc.search("//dl[@class='refs']/dt/font").each { |f|
    f.parent.inner_html = f.inner_html
}

(That reads "for every font tag that appears inside a DT item that is inside a DL of class refs, replace the font tag and everything it contains with the stuff it contains.")

The downside is that since I can make Ruby about as functional as I want (minus the static typechecking. argh), and it has awesome libraries like Hpricot, it makes it a little harder to make myself do more serious work in Ocaml...

May. 1st, 2008

Petascale cosmology talk for systems researchers

Rupert Croft from CMU's physics department gave a talk to the SDI seminar today about petascale computational cosmology, targeted to computer scientists. Turns out they have some nifty problems.

Overview: Cosmology: The study of the universe as a whole: beginning, evolution, fate, contents

  • is it expanding, etc? A few #s.
  • contents: 10^80 atoms in observable universe (+other stuff)

Why does this matter? Cosmology is the only way we know that most of the universe is not made up of "normal stuff". Atoms = only 4% of the matter/energy density of the universe. Rest = dark matter + dark energy.

Facilities:

Data will be made public immediately when taken. Google Sky will be one interface to the data.

  • 6.5 petabytes of data per year
  • Bandwidth: Telescope to base - 10Gbit/sec. Base to archive: 2.5Gbits/sec.
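
A quick back-of-the-envelope check of how those two numbers relate. This sketch assumes decimal units (1 PB = 10^15 bytes, 1 Gbit = 10^9 bits), a 365-day year, and that the full 6.5 PB per year crosses the base-to-archive link at a steady rate; none of that was stated in the talk, and the function name `average_gbits_per_sec` is made up for this example.

```ruby
# Average bit rate (in Gbit/s) needed to move a yearly data volume
# continuously. Assumes 1 PB = 10**15 bytes and 1 Gbit = 10**9 bits.
def average_gbits_per_sec(petabytes_per_year)
  seconds_per_year = 365 * 24 * 3600
  bits_per_year = petabytes_per_year * 1e15 * 8
  bits_per_year / seconds_per_year / 1e9
end

rate = average_gbits_per_sec(6.5)
puts format("average rate: %.2f Gbit/s", rate)
puts format("share of a 2.5 Gbit/s link: %.0f%%", 100 * rate / 2.5)
```

Under those assumptions the average works out to roughly 1.65 Gbit/s, so the archive link would be busy most of the time with little headroom for bursts.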

Algorithms:

  • Stacking 10 years of images (one per three days) on top of each other to get higher precision images. Align, integrate, compensate for telescope motion, moving objects, etc.
  • Gravitational lensing: Produces small correlations in galaxy shapes. O(N^2) problem. N is very, very large.
  • Near-realtime transient event alerts - e.g., spotting supernovae between subsequent measurements (supernovae can remain bright for days or weeks, gamma-ray bursts last for just a few seconds, etc. Need gamma-ray telescope that can slew in a few seconds. (!) Custom satellites for this...)
  • Massive-scale simulations about the evolution of the universe: start from initial conditions, and then simulate physics from there. Gravity, gas dynamics, radiative processes, etc. One really nice gravity simulation ran for several weeks on Big Ben, a 2068-node Cray XT3 at the Pittsburgh Supercomputing Center.

Challenges:

  • Load balancing
  • Some data structures mirrored by each thread.
  • Exploitation of as much parallelism as possible: scaling by running on faster procs and on more procs.
  • Data management
  • Visualization

Apr. 30th, 2008

Two new friends for my office


(Yes, they're squishables. I saw some at Rob and Tomas's place, and it was love at first... er, squish.)

Apr. 27th, 2008

Playing with Camels

Or, ocaml, as the case may be. A colleague at NSDI was assuring me that ocaml is actually a great language for systems programming and that I should look seriously at it for some datapository stuff. So as the equivalent of "hello world", I figured I'd play with it enough to understand how to print the lines of a file in reverse order. Keep in mind that I've never really used ml or ocaml, so an expert will probably cringe.

Ruby

puts IO.readlines("grades.txt").reverse

Ocaml

open ExtLib;;
List.iter (print_endline) (List.rev (input_list (open_in "grades.txt")))

Okay, that's not too bad. Ruby provides a bit more shortcutting (automatic conversion of lists to strings and the 'readlines' function), but they're basically the same, modulo the functional vs. object order of doing things. Oddly, the Ruby one is about twice as fast as the ocamlopt-compiled one, so I suspect I'm not doing things in The Right Way yet.

Apr. 16th, 2008

Wooo! Al Gore speaking at CMU's commencement

Full article. I may be in D.C. that weekend, but if I'm not, I'm going to break my habit of only attending the School of Computer Science commencement to go see him talk. How cool is that?
