OK! Cucumber Stuff.

So yes, there's a lot of unfinished business I still have to write about. I don't get time to code anymore let alone write about anything.

Yet I've started something new. The demand for people who understand Cucumber has never been higher. My Inbox is nuts with companies needing help hiring QA folks who know Cucumber. So, as lazy as I am, I decided to put some Cucumber training videos on YouTube. Lazy? Yes, because doing that is way easier than replying to each email I get, even if I had it set up as a form letter.

Here's the channel: Mitt Darko

I really think learning Cucumber is dead simple. There's only one difficult part: getting set up with an environment capable of running Cucumber tests. So this post is meant to address what you have to do to get going. Covering it all in a video would be terrible, especially given the number of different environments you have to cater to (Windows, OS X, rbenv vs RVM, etc).

I'd decided I was going to do this series as something casual. Just me talking to you and you figuring things out along the way. Then the whole "getting set up" thing jumped in my face. I thought about distributing a Vagrant image (a virtual machine), making videos to explain how to get set up on Windows and Mac (per Darby's advice) and just came to one conclusion:

You're going to have to do this one on your own.

Am I a jerk? Certainly. Should I make it easier and distribute some sort of package? Maybe. Here's the thing, if you're going to pick this stuff up, you need to be self motivated enough to figure out some basics.

Here's what you'll basically need, you go figure it out and meet me in the next screencast:

  • A text editor. If you're new to this sort of thing, I'll recommend Sublime Text against my better judgement. There's always Github's Atom and my favorite editor vim. PS If you want to try atom, tweet me and I'll get you an invite.
  • A Ruby environment. Try RVM or rbenv. Go look around the sites for those two tools and make a decision. I don't care what you end up using and each site will recommend something different if you're on windows, so if you're on windows, go follow those paths.
  • A GitHub account. It couldn't be easier. Just go and sign up.
  • Some sort of git configuration that works on your machine. The easiest way to get set up is to download GitHub's app after you sign up for the GitHub account in prereq #3.
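If you want a quick sanity check once you think you're done, here's a minimal sketch in Ruby that looks for the command line tools from the list above on your PATH. The helper names (`find_executable`, `environment_report`) and the list of tools are my assumptions, not part of any gem, and it only proves the commands exist, not that they're configured well.

```ruby
# Hypothetical helper: looks for an executable on the PATH, roughly the way a shell would.
def find_executable(name, path = ENV.fetch("PATH", ""))
  # PATHEXT only exists on Windows (".EXE", ".BAT", ...); elsewhere we just try the bare name.
  extensions = ENV.fetch("PATHEXT", "").split(File::PATH_SEPARATOR) + [""]
  path.split(File::PATH_SEPARATOR).each do |dir|
    extensions.each do |ext|
      candidate = File.join(dir, name + ext)
      return candidate if File.file?(candidate) && File.executable?(candidate)
    end
  end
  nil
end

# The tool names are assumptions based on the checklist above.
def environment_report(tools = %w[ruby git])
  tools.to_h { |tool| [tool, find_executable(tool)] }
end

if __FILE__ == $PROGRAM_NAME
  environment_report.each do |tool, location|
    puts(location ? "ok       #{tool} -> #{location}" : "missing  #{tool}")
  end
end
```

Run it with the Ruby you just installed. If Ruby itself isn't there, well, you'll find out pretty fast.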

I promise this is a one time thing. From here on out I provide the code and you do the learning. If I actually spent the time providing y'all with an environment or tutorial on how to set one up, this would go for aeons.

If you really, really, really need a hand held during this phase, contact me. If you've got simple questions, I'm sure I can answer them. If you really need more than that, well, we'll figure that out.

The Needs of Widgets Inc.

This is the first part in a follow up from a 2011 YouTube series I put together called Making Websites Faster. It follows up on some basic assumptions about writing Rails apps that have been outlined on this blog post.

It was supposed to be four parts. Well, that's growing.

Widgets Inc's Platform

You probably think this is where I start talking about the cloud. In a bit.

I hail from an old manufacturing town, which is but one of the reasons I always use Widgets Inc. as an example. Widgets Inc. sells physical widgets that are machined by large Widget dies.

They have a large AS/400 based product inventory control system. It's integrated all over the plant floor. It's integrated with some of their most major customers and suppliers. You're not replacing it, don't even think about it.

A few years ago, Widgets Incorporated's vendor for this application started offering a SOAP/XMLRPC option. They have a product based off this for the different vendors to integrate with, but as you can guess, no one can figure out how to use it. Even some of the higher dollar purchasers can't afford a consultancy to figure this out, let alone bring people on full time as they probably really should.

Before you say this is contrived, it's how the "enterprise" works for most of the manufacturing sector.

Widgets comes to you wanting a catalog with online ordering. While you're at it, they'd like an overhaul of some of their basic meta pages and their landing page system for sales leads.

They don't want it to cost a whole lot, but volume is sufficient and integration is difficult enough to warrant skipping managed Rails application deployment environments.

They take about 10M uniques a month and project that they're going to be doing quite a bit of volume through the web site as far as sales go.

There's another snag. There are two separate iOS apps about to hit for Widgets Inc. Yes, my how progressive they are for manufacturing, but you really gotta keep up, and you have to support it. Here they are:

  1. An app for iPad which is meant for their 200 member strong global sales force. They can't just use the site, it's an app because there is integration with their CAD drawings and machining specs to ensure the customer is getting what they need.
  2. A kiosk type marketing tablet app, meant to showcase success stories, tell a bit about Widgets' history, showcase some parts, and take emails and such for followups. It's a prototype right now, with minimal investment. This app has to be able to run offline, as the tablets will live in places where there is no connectivity.

With both apps, once the API is hit, the payload is processed and the magic happens, the appropriate parties should be emailed.

App one needs invoices and an intro letter for first time customers. Very basic.

App number two has requirements that are a little more complex. You see, Widgets Inc., founded by Mom and Pop Widget in 1902, is a family company. They know the customers in their sector can spot a form letter a mile away (and who can't really). The followup email has to be carefully crafted to ensure the customer is satisfied and the sales lead is converted. Since this is an MVP, they'd also like to know whether it's worth their while to continue investing in this program.

Your task, should you choose to accept it:

  1. Create a platform that's fast, no matter someone's position on the globe. It should be tolerant of being mentioned by Oprah.
  2. Track as much metric information as possible to ensure every department is doing the right thing and that they have the tools to further the growth of the business.
  3. Create a catalog with ordering and provide the same (and more) tools via an API.
  4. Create a landing system that works via API as well as the web. It needs to work well with iPad app #2, but also work well with all of the print, radio and Internet advertising that Widgets does.
  5. It needs an admin interface for when a customer calls in with a query. Having a customer explain every step they've done isn't just difficult, it's a "telephone game" where you're likely to get the wrong thing told to you. You need to be able to re-trace all the steps they've taken, to understand where they've been. If someone needs a line item amended, or they just want to call a whole order in, your customer service department should be able to handle this.
  6. The app should be able to be run for under $1000 a month. Any activity beyond this needs to be quantified.

Rails can't do this? You're right. Rails can do this with the help of the ecosystem you've already probably chosen to implement around your app, and probably still come in under the $1000 a month benchmark.

In the next post, we'll do some initial planning. We'll figure out how to implement a few white lies to get to Oprah scale. We'll sketch out the site map, and talk about strategy to each leaf of the tree, as well as figure out the best strategies for implementing each requirement.

Making Websites Faster - The Revisit.

So I've made some mentions of simple things I've done to make a Rails app or two run a bit faster, none of them are secrets.

In early 2011, I'd put together some Screencasts for the Youtubes about different caching methods, deferring long running jobs to workers, and other things. You can view them here: Making Websites Faster, but they're pretty out of date.

I've had a lot of folks ask me questions about these things lately, and the main answer is always the same: Avoid Rails and Ruby.

HAHA. Yeah, right? I make my living on this framework. Don't get me wrong, I think it's the best, most concise, easy to develop for, quickest to shipping framework out there. The language, Ruby, and the framework, Ruby on Rails, aren't inherently slow.

So why is it that "Rails Doesn't Scale"? Why do we fight to get our users a decent experience? Why is it an uphill fight to get things perfect as an app grows?

Why even bother making it faster? Common assumptions I've seen that are killing what could be a great app:

Assumption #1: You're Building a Rails App

You're not. Even if you are, you're not. Even if you're building an Ember app riding on a Rails API, you're still not. You're creating a website. You're creating something that speaks to a whole domain -- even if you're not directly responsible for anything other than your corner of it, you're never building a Rails app, you're building a product. A product needs a platform. Rails is a poor platform, it's just a toolset.

Assumption #2: Your Platform is Ruby/Rails

You're a developer. You're writing a Rails app. Rails runs with Rack. You kinda need a web server at this point. You may in fact need two web servers as some will require a reverse proxy from one like Nginx or Apache.

At some point you'll probably have to send email. So you'll need a worker. Best guess is that worker will need Redis in some capacity. Who knows if you'll send those things yourself or have someone else do it. Suppose you need to make that call.

So your bits and bytes need to run on a Ruby. There's a few to choose from. Suppose one runs on the JVM, you might need to figure that out. Either MRI or jRuby still need to run in an operating environment. Suppose that needs to run somewhere.

Or you could have just Heroku'd. Then you need to pick your addon providers and weigh why you've chosen one over the other. You'd still need to pick an appropriate web server, make sure your slug fits within memory tolerances, and configure said web server correctly.

Congrats, you've assembled JBoss (btw, it's now called WildFly) from scratch and picked a cloud. Hope you're actually making use of the components correctly.

Assumption #3: RAILS ALL THE THINGS.

Your marketing pages, support, landing pages, product pages, community engagement features, metrics/analytics/logging, everything. Rails.

Why? Because you need to share domain knowledge and Rails is the thing that ties all your logic together?

Does your root path page really need to be dynamic? Especially when the only thing visible on the page that has anything to do with the session is the "Logout" button? Legal page? About us? Do you realize that if you tie a session in here, it makes caching difficult to impossible in a CDN? Well, feel free to try it, but feel free to cross the streams too.
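To make the caching point concrete, here's a rough sketch of what a session-free page looks like from the CDN's side. It's a bare Rack-style handler in plain Ruby, no Rails and no gems. The page table, the `static_app` name, and the one hour `max-age` are all my assumptions for illustration; the only point is that there's no `Set-Cookie` and the response is marked `public`, so a shared cache is allowed to keep it.

```ruby
# Hypothetical page table; the paths and markup are placeholders.
def static_pages
  {
    "/"      => "<h1>Widgets Inc.</h1>",
    "/about" => "<h1>About us</h1>",
    "/legal" => "<h1>Legal</h1>"
  }
end

# Rack-style handler: takes the env hash, returns [status, headers, body].
def static_app(env)
  body = static_pages[env["PATH_INFO"]]
  return [404, { "content-type" => "text/plain" }, ["Not found"]] unless body

  headers = {
    "content-type"  => "text/html",
    # No session cookie is set, and the response is marked public,
    # so a shared cache may store and reuse it. The max-age is an assumed value.
    "cache-control" => "public, max-age=3600"
  }
  [200, headers, [body]]
end

status, headers, = static_app({ "PATH_INFO" => "/about" })
puts "#{status} #{headers['cache-control']}"
```

The moment a session cookie rides along on that response, most caches will treat the page as per-user, and you're back to hitting your app for every request.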

Do you really need analytics calls crapping up your controllers (considering sometimes they can't be done in views via JS) when your Nginx/Apache logs are fully capable of tracking all that information?
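Here's a small sketch of what I mean by letting the logs do the work. It parses lines in Nginx's default "combined" access log format and counts successful hits per path. The method names and the sample lines are made up for illustration, and if you've customized your `log_format` the pattern would need to change.

```ruby
# Parses one line of Nginx's default "combined" log format into a hash.
# Returns nil for anything that doesn't match.
def parse_access_line(line)
  pattern = /\A(?<ip>\S+) \S+ (?<user>\S+) \[(?<time>[^\]]+)\] "(?<request>[^"]*)" (?<status>\d{3}) (?<bytes>\d+) "(?<referer>[^"]*)" "(?<agent>[^"]*)"/
  match = pattern.match(line)
  return nil unless match

  verb, path, = match[:request].split(" ", 3)
  {
    ip: match[:ip], time: match[:time], verb: verb, path: path,
    status: match[:status].to_i, bytes: match[:bytes].to_i,
    referer: match[:referer], agent: match[:agent]
  }
end

# Counts 200 responses per path, skipping lines that don't parse.
def hits_by_path(lines)
  lines.filter_map { |line| parse_access_line(line) }
       .select { |hit| hit[:status] == 200 }
       .map { |hit| hit[:path] }
       .tally
end

# Made-up sample lines, using documentation-range IP addresses.
sample = [
  '203.0.113.7 - - [31/Aug/2013:20:47:00 -0400] "GET /products/surf-wax HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
  '203.0.113.9 - - [31/Aug/2013:20:47:05 -0400] "GET /missing HTTP/1.1" 404 162 "-" "curl/7.30.0"'
]
p hits_by_path(sample)
```

No controller was harmed in the making of that report.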

So What Do?

I'm revisiting the four original videos, first in blog form and next in video. Get ready for:

  1. Nginx, your platform, and you.
  2. Static content, caching and fixing your layouts.
  3. Serving from the CDN. ESI when you need to.
  4. Doing BI right: Ditch your analytics and record everything.

Actually, my markdown process doesn't have strikethrough. This is going to be a lot more parts than four. Next up is about the needs of the company we're trying to work with.

Rails, Uploaders, and Joyent's Manta

So yeah, I've been yapping about Joyent's stuff for a while now, but hear me out. Their Manta Storage service is completely awesome.

If you read the marketing material, it might seem sort of odd that I'm into this. At first glance it looks like it's for big Hadoop jobs and people looking for oil. I'm partial to thinking it's the best thing for Rubyists since delayed_job.

For the Uninitiated Rubyist

So you're going to need people to upload stuff to your site. Things like pictures of cats, avatars, spreadsheets, whatever.

The easiest approach is just to have people upload directly to your app server, your app server makes some adjustments, like making different sized thumbnails, and then shoves all the images off to where they'll permanently live.

There's a problem with that.

Tying up Ruby

If you take the easy approach, there's a pretty nasty issue. If you're using MRI Ruby and the average app server like Thin, your Ruby process is handling all the steps of making the picture available. That means the same process is not able to do things like handle requests from other people coming to your site. Yes, that person uploading a 5MB picture of their cat is going to completely ruin the experience for everyone else. Without any sort of code change, your best bet is process concurrency, that is, running another Ruby process to handle more incoming connections. This does not scale well.

The First Generation of Fixing This

So let's break this up into three steps.

  1. The browser uploads the image
  2. The server resizes images
  3. The server puts those images somewhere for permanent storage.

In an effort to offload the latter two tasks from your Ruby web process, the fine folks at Shopify created a Ruby gem called delayed_job. When the upload is done another Ruby process, one dedicated to your sweet cat pictures, takes over and lets your Ruby web process get back to what it's doing. Traffic gets bursty? Lots of cat pics incoming? Get in line!

Literally. This and the latter generations of "workers" or Ruby processes dedicated to doing this sort of work just queue up jobs until they're done. A lot of progress has been made dealing with this, my favorite has to be Mike Perham's Sidekiq -- I've used it to melt servers into small suns -- it's completely baller.
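If the "get in line" idea is new to you, here's a toy sketch of it using nothing but Ruby's built-in `Queue` and a thread. To be clear, this is not how delayed_job or Sidekiq are implemented (they use separate processes and a real backing store), and `resize` and `process_uploads` are names I made up. It only shows the shape: the web side enqueues and moves on, the worker drains the line in order.

```ruby
# Stand-in for the slow part; a real app would call an image library here.
def resize(upload)
  %w[thumb medium large].map { |size| "#{size}_#{upload}" }
end

def process_uploads(uploads)
  jobs = Queue.new     # thread-safe FIFO built into Ruby
  stored = Queue.new   # collects what the worker "puts in permanent storage"

  worker = Thread.new do
    # Queue#pop blocks until a job arrives; :done is our shutdown signal.
    while (upload = jobs.pop) != :done
      resize(upload).each { |version| stored << version }
    end
  end

  # The "web process" only enqueues, then gets back to serving requests.
  uploads.each { |upload| jobs << upload }
  jobs << :done
  worker.join

  Array.new(stored.size) { stored.pop }
end

p process_uploads(%w[cat.jpg])
```

One worker means one line. More cat pics than the worker can chew through and the line just gets longer, which is exactly the behavior described above.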

Keep this in mind for later: A full-time worker is going to cost you a chunk of change per month even for the times you're not using it and it has to stay up just in case someone comes along and uploads something.

The Other Problem

Cloud storage is pretty readily available, it's where your stuff is going to end up anyway, so why even bother with the intermediary?

There's a number of ways to accomplish this, via javascript, flash uploaders, or other methods like carrierwave_direct. Basically, when the upload to the cloud is done, a callback is triggered, the rest of the data that goes with the upload gets shoved into your app and persisted, and you're good to go.

Then your worker has to go and fetch the image from the cloud, make your different sized images, and shove all those back into your cloud storage. This approach may sound like a complete pain in the ass, but it's actually one of the best ones available right now.

One Fell Swoop

Whatever that means.

So Manta. What is it? It's cloud storage with integrated computational capabilities. You can use it just like you would any other cloud storage service and just put things in it and retrieve them, but you can also create jobs on Manta so it can manipulate your data. You can even be fancy and use map/reduce functions.

This is where seasoned Rails people can start reading.

Manta's job queues work as such:

  1. When you create a job, you create as many tasks for it as you want: map only or map/reduce, using any language or tool of your choice (for the most part).
  2. You add your Manta asset as an input to a job.
  3. Manta processes your input. You can wait on it or just shove it in the job and go away.
  4. You can close your Manta job inputs for processing now. Billing ends. Did I mention you're billed by the second?
  5. Your assets are ready. Or if you didn't close your inputs, feed it some more data.
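If it helps to see that lifecycle as code, here's a toy, in-memory model of the steps above. This is NOT the Manta API or any of its client libraries. `ToyJob` and every method on it are hypothetical names, and it processes inputs immediately instead of asynchronously. It's only meant to show the order of operations: create the job with its task, add inputs, close the input, collect the outputs.

```ruby
# Toy model of the job lifecycle described above. Not the Manta API.
class ToyJob
  attr_reader :outputs

  # Step 1: a job is created along with its task (here, a single "map" block).
  def initialize(&task)
    @task = task
    @outputs = []
    @input_open = true
  end

  # Steps 2 and 3: add an asset as input. This toy processes it right away;
  # the real service works asynchronously.
  def add_input(asset)
    raise "input is closed" unless @input_open

    @outputs << @task.call(asset)
    self
  end

  # Step 4: close the input. No more assets are accepted after this.
  def close_input
    @input_open = false
    self
  end

  def input_open?
    @input_open
  end
end

job = ToyJob.new { |asset| "thumb_#{asset}" }
job.add_input("cat.jpg").add_input("dog.jpg").close_input
p job.outputs
```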

In practice, these happen as asynchronous callbacks, and happen very quickly. There's no limit to the resources you can throw at a problem, so very little time is spent waiting. Here, go read about it. It's in English and a quick read.

So yeah. I should stop. I should mention that there are services that will handle image resizing for you. They're pretty quick -- as soon as the uploads are done callbacks are fired that will process all 6 versions of the image you have to create in parallel and upload them to your storage bucket. It's pretty performant to work in this manner. Let's just hope that free developer account on that service doesn't hit its limits when you're on vacation.


So Manta is a whole lot more than images. Go look at the tools available in Manta. I'll wait.

So since we have this giant compute facility easily available, why not store your Nginx logs in Manta? It's a no brainer if you're already running in Joyent's cloud. Oh look, built in GeoIP, PostGIS, and Python language support. Why not figure out how many of your customers living on National Park lands within 50 miles of an ocean or Billy Ocean click on your product pages for surf wax on sunny weekdays from June to September? 5TB of logs? No problem, the data never leaves Manta.

Surely any of the top great services can transcode your video for you to the formats you need. Oh wait, a professional customer wants color histograms of their content as a QA process. No problem, you've got gnuplot, FFMPEG, and R. Yes, I used to do this at TWC, but it took hours and we could only do a few at a time.

There's something that's a bit more compelling to me as a developer about Manta vs pre-boxed services. It's not all that difficult to set up FFMPEG to do your transcoding work, but sure, it's easier to use a transcoding service. What's compelling to me is that it's less of a stretch and more cost effective to have an entire toolchain at your disposal to make the next great transcoding service, and it's probably pretty effortless.


So let's talk price. You can check compute prices for yourself. In short, unless your cat pictures are also finding the cure for cancer, you're not going to be paying much. If you process 30,000 pictures this month and use a dedicated worker on another service, it's a fixed price, the worker is going to be often idle, at other times too busy, and as of this time it's going to cost you ~$40 for the month. Manta is going to cost you $1.20 and is always meeting demand.

This may sound like nickels and dimes and not something a business would worry about, but multiply all those numbers by a factor of 100. How about a factor of 1000? $40,000 vs $1200 is kinda a big deal. And you never need to touch the magic scale slider. Seriously. This makes the Heroku magic scale slider hard to use.
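Here's that arithmetic as a quick sketch, using only the figures from this post (30,000 pictures, ~$40 for the dedicated worker, $1.20 on Manta). It assumes both costs scale linearly with the multiplier, which is the same simplification I'm making in the prose, and the method names are just for illustration. Amounts are kept in cents so there's no floating point funny business until it's time to print.

```ruby
# Figures from the post: 30,000 pictures, ~$40/month worker, $1.20 on Manta.
# Assumes both costs scale linearly with the multiplier.
def monthly_costs(factor)
  {
    pictures:     30_000 * factor,
    worker_cents: 4_000 * factor,
    manta_cents:  120 * factor
  }
end

def dollars(cents)
  format("$%.2f", cents / 100.0)
end

[1, 100, 1000].each do |factor|
  costs = monthly_costs(factor)
  puts "#{costs[:pictures]} pictures: worker #{dollars(costs[:worker_cents])} vs Manta #{dollars(costs[:manta_cents])}"
end
```

At the 1000x line that's the $40,000 vs $1200 from above.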

There's something else to think about here. Want to re-process all your images to webp format? What about some other new format that becomes available? Be prepared to pay to get it in and out of your storage.

In the end it's hard to argue with anything billed to the second for about any on-demand resource you could need.

So Let Me at it Already Danko!

For the brave, there's the official documentation. It's really pretty easy to follow. There's libraries for Node.js, Python, Ruby and they even explain how to access it with bash.

So tonight I spent some time hacking support into the Fog gem so it works with Manta. You can use my fork, but official support from official sources is coming soon, according to the Joyent dev I bothered while he was on vacation (sorry kevinykchan, I didn't know). Mine's pretty dangerous. Fog is a library for dealing with cloud resources, like putting things in it or starting jobs in a compute cloud. I've yet to try integration into Carrierwave, but for storage it should work. I'll have to spend some time hacking through carrierwave/lib/carrierwave/processing to add the rest of the goodies.

There are also CLI utilities you can install with npm install manta -g if you already have node installed.

Where after there?

I have no idea. I'm old and can't think of such things anymore. But there's a lot at your fingers, so go give it a shot.

No Comments, Only Zuul

Just got a complaint that there is no way to comment on a post here.

It's on purpose. I don't want to fight with you on the Internet or facilitate a way for people to do that.


Scale. What a bunch of nonsense.

Scale. I completely loathe the word. Having spent a good portion of my life in the "Enterprise", it's something I had to deal with a lot.

The word implies limits. Are limits good? Bad? Ugly?

Over the past few weeks, I've been on a serious journey trying to redefine what scaling means to me.

Scaling your Ideas and Actions

I once had this crazy idea and ran it past my pal Leon while hanging outside the Neo offices in Columbus. The gist of it was an app where people would buy it, snap a selfie, I'd get it, then I'd draw their pic. The first thing I thought about was "well, what if everyone bought it? That's like tens of thousands of pics I have to draw".

His exact response is kind of lost to time, but was along the lines of "so what IF everyone bought it?".

I suppose I'd have a completely different set of problems, but a whole lot of app store dollars. I'd find a way to deal with it should I have gotten 100,000 buys.

I'd realized that when I thought about scale, I'd lost quite a bit of childhood imagination, which is what really makes awesome… everything. I'd dismissed something simply because the thought of scale made it seem impossible. Nothing is impossible.

Scaling your Arsenal

It seems like everyone is using computers for music these days (get off my lawn). In the early 2000's, there was sort of a backlash on this. I'd made the acquaintance of someone who didn't hate on the idea of music production on a computer.

There was a problem in that time period though -- the amount of different tools available to musicians was (and it still is) insane. Plugins, programs, hardware, there was so much going on. But then he let me listen to some of his material. It was amazing. I asked him how he did it and he said he just used one app and made himself work within the confines of it.

I'd been throwing different tools at the problem and not having the results that I got from Scott's productions. By not limiting my options, I'd gotten distracted and not expanded my own creativity.

The Middle

So as a programmer, I'm often faced with every day decisions that have qualities of both the above situations. Do I redesign something because I think it's too difficult? Do I give up after using (or not using) a third party library?

Do I go and be @Saterus (ha, sorry dude) and say that using Devise is mostly always a bad decision or do I try and work within the confines of that system to understand it fully for what it is?

I have no advice to offer you except for that finding your own tools for the toolbox is something you have to do on your own.

I could tell you all about the technical details, I could show you the research I've done, I could whiteboard the money you'd save by going with jRuby over MRI. If people listened to that kind of talk, we'd not have Celluloid (and by proxy, Sidekiq), EventMachine, or many other kick ass things in the Ruby world.

Lessons learned from Introspection

Just go do it. Explore your world. Find what works for you. Find what is a miserable failure and learn from it. Be inquisitive, don't let yourself be set back by previous failures. Question your own opinions. Sleep on things if you can and come back to them as a devil's advocate.

Don't claim ignorance as a reason for anything. If someone says something is compelling, don't take their word for it, go find your own scientifically based opinion.

Lastly, accept personal challenges. It's easy to convince yourself that your current habits and decisions are the right ones because it's your only experience. Go ahead and argue that MRI is better than jRuby, but be prepared to learn something and change your mind.

  • Mike

ps. jRuby is totally awesome, stop using MRI.

Pair Programming with my Six Year Old Son.

I should be asleep. My daughter fell asleep at 7PM and will likely be up at 5:30, but I had to write about this while it was fresh on my mind. I think it completely changed the way I pair, forever.

I've been working on a badge system for communities to reward each other for things. I have another post upcoming about what's OK to give badges for and what badges are a bad idea, but more on that then.

He was off doing his own thing and I was off doing mine and I figured, "you know, I should really pair with him on this".

Best Decision Ever

We were just making some SVG icons for the project with Sketch 2. I think in hindsight, this was a WAY better idea than trying to work on the Ruby side of things. We could share and he could have an actual opinion about how something looked, how cool it was, how excited he was about it, or how terrible it was. There was a bit of "work" that had to be done on vectors, so it wasn't all instant gratification, but it was good to immediately see some result.

Immediately I thought to myself that I was not going to try and pair with a six year old on this -- he was going to be treated like any other pair I'd work with.

So we sat down and did the first thing every pair does: gets right back up again because we're out of snacks. I was going to refuse this at first since he'd already had some pretzels we'd baked earlier, but I'm treating him like any other developer, right? We have to be authentic about this!

So I took a few moments and explained the project. I had to assert that I didn't think he was paying attention once and that I really valued his input on it… no, that I REQUIRED his input on it to have it be successful. From this moment on he was IN.

I drove, but explained the things I was doing. We already had a badge template and needed to make some more badges. I told him about pairing at work and what I should expect out of a pair.

He started on the path of "oh yeah, that's good, I like that." He had lots of good things to say, but he didn't tell me what he disagreed with.

I said he should by all means tell me if something was terrible -- even if it was just a feeling that it was terrible and he couldn't tell me why.

A Six Year Old Has an Epiphany, and so does his Dad.

The gears were SMOKING in his head. In more words than I'll put here, I'd said that if he knew something was terrible and could tell me why, then to go ahead and do so. I also said that if he had a feeling something was terrible to not say that it was terrible, but to say that he had a feeling it was bad and that I'd help him from there.

Of the things you'll never learn in school, I think there were about a dozen in those statements that you need in life. We immediately went into a feedback loop on everything, and sorry to the serious geniuses I've worked with, I had found the most effective pair I'd ever had. Granted, it was only through opening my own eyes that I could be the other side of that.

What is effective pairing but being inquisitive and childlike in your approach to problems? Jonathan is really attuned to how others perceive him, as I think a lot of kids are, and it was completely mind-blowing for him that being successful on a project was a result of being able to be himself.

From reading him, I noticed that he relaxed significantly now that there was no expectation that this was going to be a heads-down work-mode sort of thing. And that's when my son turned into a full-blown first-class developer for an hour.

"Let's get more snacks!"

"That should be blue!"

"I don't like the shadow on that!"

"The pretzel is too squishy" (not the ones we were eating, but the one in a badge.)

"What's a gradient?"

I've always known that explaining even a feeling was something that was a success factor in a good pair, but watching that click in someone else's head for the first time is a heck of a reinforcement.

Now Jonathan, this is when we Github, I mean share.

Mind blown again.

"So Jonathan, we're going to put this up on the Internet where people I do work with can see it, make new badges, make our badges better, or do whatever they want really".

I'll let the README explain the rest: dankobadges README.

There was a moment of elation for him when it clicked -- DADDY GETS TO HAVE THIS MUCH FUN ALL THE TIME?!?

The Aftermath

I've not seen this level of meta-learning for a child in a long time. As he gets older, moments like this in schooling will become increasingly rare. I find this as something that makes me sad as well, it's how adults actually work together and get things done. It's more important for me to see meta-learning skills develop than for him to know many facts, and wow, did we push the envelope with a couple of icons.

I'm going to do my best to set aside two hours of pairing time for Jonathan once a week. I think this should be curriculum in grade schools. Even if it was, I'd still do it, I've got things to learn too you know.

ps. The one with the bad word on it was hidden from view the entire time.

Things I need to blog about
  • Pairing with my six year old on a project. So epic.
  • Subjective Outcome gaming topics.

Fatherhood is an Application Stack.

I have two children: Jonathan, age six, and Cate, age three. An interaction we just had reminded me of a deployment stack.

Cate: "Daddy my puppet is broken! Daddy my puppet is broken!"

Jonathan: "Daddy can I use the computer?"

Me: "Cate, slow down, one thing at a time. Jonathan, I'll help you when I'm done with Cate" (HAProxy)

Cate: "Daddy stick puppet is broken."

Me: "What's wrong with your puppet?" (HAProxy queues and load balances to Varnish #1)

Cate: "It needs a shirt" (Varnish checks her request)

Me: "Cate, I just gave it a shirt".

Cate: "No, a new shirt" (Cache invalidated.)

Me: Cutting out new paper, Applies new shirt with gluestick

Me: I assemble the old popsicle stick, with its new component. (Content delivered, Stick is a page, I suppose its torso is now an Edge Side Include with a lower TTL than the rest of its wooden body). I deliver it and tell her I can't make more paper clothes for it for a few minutes. (Setting content expiry)

Cate: Cate looks at the delivered, whole, same popsicle stick, with its clothing as a new component. "Daddy I…" (Client request)

Me: "Jonathan needs my help, figure it out yourself" (HAProxy routes to Montessori Education Server)