At PipelineDeals, we have a growing number of satellite apps. We still have one main monolithic Rails app (which we’ve started calling p.core), but ever so slowly we have ported bits and pieces of functionality out to smaller, separate apps. These apps communicate with p.core and each other in a couple of ways:
Essentially, our main Rails app (p.core) broadcasts events that happen in the system to any satellite app that is listening. For instance, if a person’s email address is updated, we send a pubsub message like person:updated:123456, and the payload includes information about the attributes that changed.
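To make the message shape concrete, here is a minimal sketch in Ruby. Only the `model:action:id` topic format comes from the example above. The `Event` struct, the JSON payload, and the `[old, new]` pairs under a `"changes"` key are assumptions for illustration, not a description of what p.core actually sends.

```ruby
require "json"

# Hypothetical helper. The topic format "model:action:id" follows the
# example above; the payload layout is an assumption for illustration.
Event = Struct.new(:model, :action, :id, :changes) do
  # Builds the topic string, e.g. "person:updated:123456".
  def topic
    [model, action, id].join(":")
  end

  # Serializes the changed attributes as JSON.
  def payload
    JSON.generate("changes" => changes)
  end

  # Rebuilds an Event from a topic string and a JSON payload.
  def self.parse(topic, payload)
    model, action, id = topic.split(":", 3)
    new(model, action, Integer(id), JSON.parse(payload).fetch("changes"))
  end
end

event = Event.new("person", "updated", 123456,
                  { "email" => ["old@example.com", "new@example.com"] })
```

A satellite app that subscribes to such topics could call `Event.parse` on each incoming message and react only to the attributes it cares about.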
Conversely, when a satellite app needs to make a change to a model in p.core, the satellite app will use our public API to write changes.
When using our own APIs, we’ve grown into a pattern for wrapping them with Ruby classes that feels very clean and predictable to us.
We on the PipelineDeals engineering team feel proud of the work we do! But you wouldn’t know it by looking at the previous (lack of) content on this blog. We aim to change that.
So why the name change? Reason 1: our previous WebTwoPointOh-style name for it, PipelineDeals DevBlog, makes us want to take a nap. Reason 2: software engineering has a lot in common with plumbing (an idea I stole from @ngzax, who probably stole it from somewhere else).
At PipelineDeals, we deploy code frequently, usually 2-3x per week, and sometimes even more often. As all web application developers know, deploying is sort of a nervous process. I mean, sure, 99.99% of the time, everything will go perfectly smoothly. All your tests pass, your deploy to staging went perfectly, all the tickets have been verified. There is no reason to fear hitting the button. And, the vast majority of the time, this is true.
But all web application developers also know that sometimes, there is a snag. Sometimes the fates are against you, and for whatever reason, something goes bust. Perhaps you have a series of sequenced events that must occur to deploy, and one of the events silently failed because the volume that /tmp is mounted on temporarily had a 100% full disk. Perhaps that upload of the new assets to S3 did not work. Perhaps you did not deploy to ALL the servers you needed to deploy to.
And then, the worst happens. For a short period while you are scrambling to revert, your customers see your mistake. They start questioning the reliability of your system. Your mistake (and it is yours, even if some bug in some server caused the problem) is clearly visible to your customers, your bosses, and your peers.
PipelineDeals is hosted on Amazon’s AWS cloud platform, and has been since 2007. During these years we have been exposed to two separate mass outages, both of which affected the US-EAST region.
Compared to other AWS regions, US-EAST is the busiest, has the cheapest hourly server rates, and (as it happens) is the most prone to massive outages.
Given that the majority of our customer base happens to be closer to the east coast, we keep our servers hosted in US-EAST. What this means, however, is that we must be prepared to jump ship to another region with as little downtime as possible.
The developers at PipelineDeals are happy to announce a new version of our API. V3 introduces many changes, including the following:
We ended up hooking into window.onerror, which we define right after we add jQuery, right at the top of the page.
In March, Twitter unveiled “Kiji”, an effort to significantly reduce the impact of running the garbage collector in the Ruby Enterprise Edition (REE) runtime. Many have criticized their effort because it focused on the older 1.8.7 instead of the newer 1.9.x version of Ruby. But for older, larger apps that still run Rails 2.x or need Ruby 1.8, Kiji may be worth exploring.
MRI relies on a single heap that basically resembles a slab allocator. New slabs are created via malloc() and carved up into fixed 40-byte slots. Each slot can hold a variety of Ruby objects. When a slab becomes full, GC is invoked to clean things up by looking for unreachable objects. If the slab is still full after GC, then a new slab is allocated.
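The allocation policy described above can be sketched as a toy model in Ruby. This is not MRI's implementation: the `ToyHeap` class, the four-slot slabs, and the `:reachable` flag standing in for a real mark phase are all made up for illustration. It only mimics the order of events: fill the slots, run GC when nothing is free, and add a slab if GC frees nothing.

```ruby
# Toy model only. Names, slot counts, and the :reachable flag are
# hypothetical; real MRI slabs hold many more slots and GC marks
# objects by walking references rather than reading a flag.
class ToyHeap
  SLOTS_PER_SLAB = 4

  attr_reader :slabs, :gc_runs

  def initialize
    @slabs = [Array.new(SLOTS_PER_SLAB)] # nil marks a free slot
    @gc_runs = 0
  end

  # Places obj in a free slot. If none is free, runs GC first, and
  # adds a new slab only when GC could not free anything.
  def allocate(obj)
    unless free_slot?
      collect
      @slabs << Array.new(SLOTS_PER_SLAB) unless free_slot?
    end
    slab = @slabs.find { |s| s.include?(nil) }
    slab[slab.index(nil)] = obj
    obj
  end

  private

  def free_slot?
    @slabs.any? { |s| s.include?(nil) }
  end

  # Stand-in for mark and sweep: frees every slot whose object is
  # flagged as unreachable.
  def collect
    @gc_runs += 1
    @slabs.each do |slab|
      slab.map! { |obj| obj && obj[:reachable] ? obj : nil }
    end
  end
end

# All objects stay reachable, so GC frees nothing and the heap grows.
growing = ToyHeap.new
5.times { growing.allocate({ reachable: true }) }

# All objects are garbage, so GC frees the slab and no growth is needed.
steady = ToyHeap.new
5.times { steady.allocate({ reachable: false }) }
```

In the first case the heap ends up with two slabs after one GC run; in the second it stays at one slab after one GC run, which is the trade-off the paragraph describes.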