Welcome
I'm a software engineer. I'm currently running the tech side of the house at ThinkNear, changing the way small businesses find and attract customers.
I like scaling big, distributed systems, finance, politics, crossfit, football, being productive, learning new things, and mastering old things.
Profile
Summary
Experience
- Oct 2010 - Present: Head of Software Engineering / ThinkNear
- Feb 2007 - Present: Software Development / Amazon.com
  Engineering experience building a targeted marketing, content management, and rendering engine for all of Amazon's credit businesses. The system supported real-time customer identification and segregation based on business-supplied heuristics, delivering personalized ads and supporting the ensuing credit application workflow. Also built a full, scalable solution supporting instant credit approval.
  Developed Amazon's next generation payment instrument management system. Responsible for securely storing credit cards and bank accounts, and aggregating access to dozens of methods of payment. Oversaw the scaling from 0 to dozens of clients, from tens to millions of transactions per hour while maintaining 99.99% availability.
  Managed the Merchant Ordering Experience team and owned the 11 related services. The team's scope was to support third party order fulfillment on the Amazon platform worldwide; responsibilities included team, project, and product management as well as technical leadership. In addition to the back end services and databases, the primary application had 14 merchant facing pages which together are the highest trafficked part of the sellercentral.amazon.com site. The primary application managed seller interactions with Amazon during order fulfillment: tracking orders, confirming shipments, and managing returns/refunds. Third party orders had an annual run rate in excess of $10 Billion.
  Primary technology experience was in Java, both Jetty and Tomcat platforms. My teams designed and built massively scalable and incredibly redundant database architectures using Oracle as our primary data stores. Gained lots of experience with concurrency, bottleneck and performance analysis, distributed systems architecture, and high pressure trouble-shooting. Other noteworthy technologies used included: EHCache, Hibernate, Perl, Mason, SQL, and a significant number of Amazon proprietary systems.
- May 2005 - Present: Software Engineer / Armonicos Co. Ltd., Hamamatsu, Japan
  Developed solutions for CAD products in C++. Implemented a new file format for handling large files which reduced file size by 75%, save time by 90%, and load time by 98%. Introduced scripting capability based on COM standards, allowing users to script operations in VBScript.
- 2004 - Present: Developer / CenterPoint GmbH, Villach, Austria
  Developed WSDL/SOAP-based distributed communication solutions in C++. Retrofitted a proprietary RMI library to support WSDL-defined communication across platforms. Upon completion, the library was capable of dynamically offering, discovering, and consuming web services.
Education
- 2001 - 2006: University of Waterloo, B. Math in Honors Computer Science
  Activities: Sigma Chi Fraternity -- extensive involvement including President, Vice President, Treasurer, and more.
- 1996 - 2001: South Secondary School
Additional Information
Updates
- Playing the Michael Bolton captain jack sparrow snl video in the bar right now
- The thriving data ecosystem in NYC http://t.co/eswoKDE0 via @wordpressdotcom
- http://t.co/odN9PwyB Did not see this coming. #isjavafree
- @zefer It's off an on for me. Not really usable. Know any competitors worth checking out?
- Who are Airbrake's competitors? We're on the market for someone else.
- @redistogo If we upgrade our account, does our data get migrated? Or would we need to migrate/recreate it manually? (we're on heroku)
- Big day at in n out. Go figure #420
- @declanm where can I get on a waiting list for Calyx
- Pro Tip: move last weeks coffee further away from you than this mornings coffee.
- @Delta 's model appears to be "wait in line until you're late". Only first class and late people getting through.
- @DanDotLewis any chance Now I Know has an RSS feed?
- @erictj false! Terminal not zsh without loaded zshrc not a terminal. Is source of frustration.
- Huh. Roku's can crash. Guess it hasn't been a very productive weekend.
- Building for the mobile use-case. http://t.co/M72LUUP1
- I'm a #TechStars grad & its the best thing you can do for your startup. Final deadline 3/16 for Boulder! http://t.co/YVJ7HSNr cc: @nglaros
- Why Isn’t Mobile Display Advertising Huge Yet? http://t.co/opjXOGes @eportnoy
- congratulation to @engadget for having the only working live blog (linked from macrumors) of apple's event http://t.co/wn3aRZOl
- @bdotdub still lots of Ruby. Just need some Java, too. Someday, Ruby will be faster.
- New Blog Post from SoftwareGravy: SynchronizedTreeSet faster than PriorityBlockingQueue -- http://t.co/PmsNijFU
Posts
Prediction: Eric Kessler will change his views on cable cutters, or he will no longer work at HBO. Only a matter of time.
CS 193H High Performance Websites
Good work Stanford! Producing graduates with applicable skills — so valuable.
I’ve thought Universities needed more applicable tracks for a long time. I’ve heard people argue against me, too, and I’ve never felt their arguments held much water. One that’s been brought up a few times is that “they teach concepts applicable to lots of scenarios, not specific implementations”. I think you can learn how real-world people have applied the concepts to real tools, debate their decisions, and come off with a well-rounded understanding of the theory plus an understanding of a specific example. As an employer, I’d prefer you to know how to tune a garbage collector -- any garbage collector -- than not. The concepts port well, but if you’ve never done it before, it takes a long time.
I think the real reason there aren’t more applicable classes at a lot of Universities is that the profs can’t teach them. A lot of profs have never worked outside academia. They got a Bachelors, a Masters, a PhD, and now they teach. Or else when they did work it was decades ago, or maybe in a research center, or wherever. Regardless, I don’t believe there are many professors around with Twitter, Facebook, Amazon, or Google experience. By the end of their PhD program, they’re under-qualified to teach anything practical, because they may not know what is practical and what isn’t. (To be sure, this does not apply to all profs.)
Don’t get me wrong, PhD work is incredibly valuable, and leads to amazing real-world implementations of breakthrough theories, but the professional implementors are generally underrepresented at Universities. Students graduating with a Masters or a PhD are often less qualified for practical software engineering jobs. Stanford seems to be addressing this, and I hope other Universities follow.
Other classes I’d love to see:
CS 19X Latency-Constrained, High-Volume Services
- load testing
- tuning the JVM
- detecting and determining bottlenecks
- scale out, not up
- estimating hardware needs
- service redundancy
- DB sharding
- DB failover strategies and implications
- DB growth analysis and strategies
- seamless deployments and rolling back
- threading horror stories
- server security and authentication
CS 19Y Unix Administration for Developers
- shell programming and the profile/zshrc/etc
- applications run as users
- what’s listening on that port? (and network strategies)
- server tuning and hardware selection
- logging
- AWS-applied (from zero to service stack)
- backups, archives, storage, and recovery
- DNS
- connectivity
- keys, passwords, tunneling and other security concerns
CS 19Z Practical Development Practices and Tools
- source control
- unit testing
- integration testing
- continuous build & test, code coverage
- dependency management 1, POM Hell, an introduction to Maven
- dependency management 2, bundler internals
- deployments
- REST vs SOAP
- real-world commenting, READMEs, and technical blogging
- bug tracking & sprint planning
I just read a blog post off hacker news: Why loading third party scripts async is not good enough. It reminded me of someone I used to work with at Amazon who would regularly find errors in our applications. This was quite a feat at Amazon because we instrument everything. We have regex’s constantly parsing logs looking for errors, we have a dozen kinds of monitors collecting host metrics, server metrics, client metrics, business metrics, coffee temperature metrics, etc. all constantly checking “is your cpu load high?”, “do you have enough free memory?”, “how many times did you show pictures of the Twilight case?”, etc.
This one engineer (on a team of exceptional engineers) was consistently the only one to find errors. It was definitely very healthy for the team but …. engineers secretly hate this because, by definition, finding errors means he’s pointing out faults in your work. Managers less secretly hate this because it means he’s ‘creating’ high priority work that gets addressed ahead of their projects.
So with all these metrics and monitors on a team of high achievers, how did this one person on our team keep finding errors? He looked at the logs.
That was his secret weapon; reading logs! It’s like grade 1 of service maintenance. With all our monitoring, regex’s, and features we thought we were too good to ‘just’ read logs. The rest of the team would release features, put regex’s to detect our errors, trace a few requests after launching, and then move on to the next project. I honestly don’t know how much time he spent on it, but every week or two, he’d come in and explain how our programs were messing something up.
Things like:
- requests to a dependency fail. We monitor overall failures, and accept failures of less than 0.1% (just hiccups and connection problems, right?). Turns out our dependency never worked for 0.1% of our customers.
- we have a dependency known to have errors, but retries often succeed. We retry every request once before raising an error. Our dependency makes a change which we don’t notice, but our retry rate goes from 2% to 50%.
- you have ‘targeting’ params which you consume if available (e.g. the HTTP Referer header). You make a change which loses this data in the course of a request and now you’re never using it to target.
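The first scenario is exactly what an aggregate failure rate hides. Here is a minimal sketch of the drill-down, assuming the logs have already been parsed into hypothetical `(customer_id, succeeded)` pairs. The record shape, the function names, and the sample data are all made up for illustration and are not Amazon tooling.

```python
from collections import defaultdict


def overall_failure_rate(records):
    """Fraction of all requests that failed."""
    records = list(records)
    if not records:
        return 0.0
    failures = sum(1 for _, ok in records if not ok)
    return failures / len(records)


def always_failing_customers(records, min_requests=3):
    """Customers with at least min_requests requests and zero successes."""
    totals = defaultdict(int)
    successes = defaultdict(int)
    for customer_id, ok in records:
        totals[customer_id] += 1
        if ok:
            successes[customer_id] += 1
    return sorted(
        c for c, n in totals.items() if n >= min_requests and successes[c] == 0
    )


# Hypothetical data: 1999 healthy customers and one who never succeeds.
records = [(f"cust-{i}", True) for i in range(1999) for _ in range(5)]
records += [("cust-broken", False)] * 5

print(overall_failure_rate(records))      # 0.0005, under a 0.1% alarm threshold
print(always_failing_customers(records))  # ['cust-broken']
```

The aggregate number looks like harmless noise, while grouping by customer shows that one customer is failing every time.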
There were three morals of this story:
- Drill down into your metrics and understand where they are coming from (and their deficiencies)
- Your monitoring will never be perfectly reliable — you regularly need to just randomly re-verify things are working
- Every time you catch a problem, install the proper monitoring to make sure it never happens again
In my experience, the most likely error is one you’ve seen before.
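The retry scenario above suggests tracking the retry rate as its own metric, since a monitor on final failures alone stays quiet while retries absorb the damage. A sketch, assuming each request is summarised by a hypothetical attempt count (1 means no retry, 2 means one retry). The 10% threshold is an arbitrary value chosen for illustration.

```python
def retry_rate(attempt_counts):
    """Fraction of requests that needed at least one retry."""
    if not attempt_counts:
        return 0.0
    retried = sum(1 for attempts in attempt_counts if attempts > 1)
    return retried / len(attempt_counts)


def retry_alarm(attempt_counts, threshold=0.10):
    """True when the retry rate exceeds the (assumed) threshold."""
    return retry_rate(attempt_counts) > threshold


# Hypothetical windows: 2% of requests retried before the change, 50% after.
before = [2] * 2 + [1] * 98
after = [2] * 50 + [1] * 50

print(retry_rate(before), retry_alarm(before))  # 0.02 False
print(retry_rate(after), retry_alarm(after))    # 0.5 True
```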
Applying for a job as a software engineer?
The odds are you have a bad resume (since >50% of the resumes I’ve seen are bad)
The Resume
Objective
No! Waste of space. It can only hurt. Your objective is to get a phone call or email. Don’t apply for jobs you don’t want. If you’re unsure, just apply like you want it and ask questions in the first conversation.
Length
2 pages max, 1 page if possible
Format
Prefer PDF — it’s more universal. Windows is losing market-share in development circles. Word is not universal.
Check the job description — if they indicate a format, follow it.
If you do use MS Word:
- email it to yourself and verify it looks good in Google Docs (I’m not going to download it)
- email it to a friend with a Mac and verify it looks good (some people do download it)
Credentials
- CS Degree at the top
- Certificates at the bottom or omitted
- Github account at the top (as a link)
Experience
- One entry per company
- Include years worked there
- Multiple roles => more bullet points
- What was your role? What did you do?
Technologies
- SMALL LIST (wow, you know XML? Really? Cause that’s hard to find and hard to teach … not)
- technology stacks and platforms are enough
- don’t list things you would be uncomfortable interviewing in (with 1 day’s notice)
- Seriously! You list it, the interviewer can ask about it, and expect you to code in it
Good Examples
- Java: Tomcat on AWS with MySQL
- Ruby: Rails on Heroku with Postgres
- Web UI: HTML/CSS/JS on PHP
Bad Examples
- Java: Tomcat, Jetty, Maven, Ant, JUnit, XML, SOAP, Hibernate, JSP, blah blah blah
- Ruby: Rails, Rake, JSON, Devise, Cucumber, Webrat, RSpec, VCR, FactoryGirl, ….
Small caveat
If you’re applying to big companies that are tech ignorant, you might need a laundry list to get past HR.
Extra Curriculars
- yes if you’re just graduating from college
- yes if they’re significant
- for non-new grads, at most one line (unless you’re putting in 5+ hours per week)
- don’t list every club you’ve ever been a part of
- don’t list activities that include demographic information (Gay and Lesbian orgs, Christian Missionaries, etc)
Don’t list things that some people “don’t get”. No need to mention your involvement with D&D, Video Game communities, or Twilight fan clubs
Cover Letter
Generally not required.
When is it a good idea?
- when you don’t have a CS degree
- when you’re applying out of your depth (UI developer applying to be DBA?)
- when you really want the job and can speak intelligently about the company
When is it a bad idea?
- when you copy paste the same cover letter every time
- when you have nothing to say (i.e. you’re summarizing your resume)
Coding
Active on the job market? You should have one of three things on your resume
- a great school (Stanford, MIT, Waterloo, UIUC, Carnegie Mellon, etc)
- a great company (Google, Amazon, Apple, Facebook, or a known startup)
- a Github account with a bunch of stuff in it
There's a very undervalued skill I've observed over the years writing software, and that is complaining (to be valuable, you do have to be good at it). I've observed it in Amazon's trouble ticket system, in open source communities, asking questions on stack overflow, and pretty much anywhere else engineers interact collaboratively. Learning to complain accurately and precisely is an incredibly valuable skill and is a hallmark of great software engineers.
Part 1: Waterloo fails a lot of students majoring in Computer Science.
| Start Year | Student Count | Degrees Awarded | Percentage |
| 2005/06 | 280.8 | 222 | 79.06% |
| 2004/05 | 416.8 | 215 | 51.58% |
| 2003/04 | 548.2 | 339 | 61.84% |
| 2002/03 | 667.5 | 354 | 53.03% |
| 2001/02 | 604 | 448 | 74.17% |
| 2000/01 | 668.5 | 514 | 76.89% |
| 1999/00 | 618.5 | 462 | 74.70% |
| 1998/99 | 548.5 | 421 | 76.75% |
| 1997/98 | 641 | 387 | 60.37% |
| 1996/97 | 513 | 310 | 60.43% |
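The Percentage column appears to be degrees awarded divided by student count. Here is a quick check of that reading against the first three rows of the table. The function and variable names are mine.

```python
# (start year, student count, degrees awarded), copied from the table above
rows = [
    ("2005/06", 280.8, 222),
    ("2004/05", 416.8, 215),
    ("2003/04", 548.2, 339),
]


def completion_pct(student_count, degrees_awarded):
    """Degrees awarded as a percentage of the starting student count."""
    return round(100 * degrees_awarded / student_count, 2)


for year, students, degrees in rows:
    print(year, completion_pct(students, degrees))
```

The printed values match the table's 79.06%, 51.58%, and 61.84%.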
As I get deeper into AI, I realize that there was a math class I never had which would have been incredibly valuable as an undergrad. This class should have been day 1 of my undergrad, perhaps repeated every year, and certainly addressed for 20-30 min of each math course I took. That class would simply be a survey of mathematics, or maybe just, math from 10 000 feet.
When we interview for technical ability here at ThinkNear, we are seeking raw talent. Raw, not to mean that you’re lacking experience, or applied ability, but to mean that underneath everything you can do, there is real, innate talent. The thought process is: anyone can get pretty good at something if they put in long hours, but some people have learned the fundamentals so well that they can learn new tricks much easier than others.
Technically, this means we want you to know a few languages, just because knowing multiple languages helps you be better at all of them, but you should know one really really well. The punchline is that I don’t care which one. Whichever one you choose, though, I hope you know it really really well.
If you know Lisp really really well, then I think you can pick up Java and be better than a lot of Java programmers quickly. Or Ruby, or Python, or C++, it doesn’t really matter, because you’ve learned one language so well. You’ve gone deep, optimized, written your own lists to suit your needs, or written your own packages to upgrade a toolset.
When we look for technical smarts at ThinkNear, we want to see evidence of your skill and past accomplishments. Today, in a blog post about .Net, my point was made well a couple of times. Here’s the post: http://samsaffron.com/archive/2011/10/28/in-managed-code-we-trust-our-recent-...
If you’ve tuned .Net, you are probably well qualified to tune Tomcat, or Django, or Rails. You just know what all the types of bottlenecks are.
I’ve had this on my mind for a long time, but Joel finally gave me a concrete example to make my point.
Obligatory Old Timey Story: in the 1980s garbage collection was a real performance problem on Lisp machines, which were lucky to have a meg of memory. There were a few elite Lisp hackers who practiced what they called “cons-less programming” – cons is the only thing that allocates memory in Lisp – to avoid GC. Since cons is really the fundamental building block of just about everything you do in Lisp this was kinda difficult, but the technique is almost identical to what you’re describing here. — Joel Spolsky
Roughly translated, if you were good at something a couple of decades ago, there’s a good chance you’d have the right intuition to help on this problem. The specific skills have become obsolete, but the raw talent would continue to pay dividends.
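For readers who haven't met the technique Joel describes: the idea is to reuse memory that is already allocated instead of creating fresh objects, so the garbage collector has less to do. Below is a loose Python analogue of that shape. It is not Lisp and is not taken from the linked post.

```python
def scaled_fresh(values, factor):
    """Allocates a new list on every call."""
    return [v * factor for v in values]


def scale_in_place(values, factor):
    """Overwrites the caller's list; no new list is allocated."""
    for i in range(len(values)):
        values[i] *= factor
    return values


buf = [1.0, 2.0, 3.0]
fresh = scaled_fresh(buf, 2.0)
same = scale_in_place(buf, 2.0)

print(fresh is buf)  # False: a second list now exists
print(same is buf)   # True: the original list was reused
```

In CPython the individual float objects are still allocated either way, so this only illustrates the pattern and is no real GC optimisation.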
I just went to pay my bill on the LADWP (don't worry about it), and I didn't know what my username was. I have had a few over the years. Turns out the one I had used for this site was 'jehinnegan'. In order to be reminded of this, I had to supply my email address and then answer one of my security questions.
I posted on Hacker News today about trying to hire AI/big-data people, and what we're looking for (shameless plug: we're hiring at careers.thinknear.com). Someone sent me an email asking about how to get into the field from college. Here was his question (slightly edited).
I'm just under two quarters away from finishing my Masters in Computer Science (..). I'm (..) in the process of evaluating my post-graduation opportunities at the moment, and have found it hard to have any real goals beyond wanting to be good at machine learning / data mining at some point in the future. I was wondering if you could offer me any advice about any particular things I should be focusing on at first, or if there is a particular progression that you think would make the journey easier?
My response was a lot longer than I had intended, so I'll share it here. It's not terribly well structured, but hopefully someone else will get some value.
First and foremost, you need to be able to code. There's a level of competency that a lot of engineers can hit within 2-3 years of graduation, that is basically 'professional competency'. It's important to hit this early. People with your skill set have to be able to model on their own (that is, get something working) and then be able to supervise/advise teams that implement full scale versions. In a lot of companies, if you can't prototype, you're not terribly useful.
I would not shy away from big companies, but be very specific about what you will be doing or what team you will join. A lot of big companies (Google, Amazon, etc) may allocate new grads somewhat randomly or where they need to put them. If you go the route of a big company, just keep in mind that you have all the power -- you can choose where you want to work down to a team (as long as that team is hiring), because you can always go somewhere else. They will accommodate your special requests if they like you. Having a big name on your resume early on will really set the tone for your career. Working at Google is kind of like going to Stanford -- it matters as much that you got in and didn't get fired as it does what you did there.
On the flip side, you will probably learn more practical skills and have more fun going to a smaller company like ours that needs hands-on data people. There's a lot of 'career infrastructure' to get in place, and the sooner you can do that, the sooner you can work on the cool stuff. The 'infrastructure' is stuff like: getting good at setting up hosts and environments, and experience writing software well (using git, editors, code reviews, troubleshooting bugs from production). Compared to academia, there's also something very alien about working on code for years, working on a product that is tens of thousands of lines of code running on dozens of hosts, and maintaining code that was written 5 years ago by someone you've never met. These are just valuable skills wherever you go.
If you pick the right small company, it's better career wise than the wrong one, but the big companies that everyone knows are probably safer (possibly not as good career-wise, but pretty much guaranteed not to be bad). If you have debt, it's hard to go wrong going to the big companies which definitely pay a premium (I think it's because they pay you to write software AND to tolerate a bunch of bureaucracy). If you go to the startups, really meet and like your boss and the founders. You're basically betting a lot of money on their success, so be sure you want to do that. Don't buy hype or sales. Look at their resumes -- look for experience and a track record of success. (Or apply to YC or TechStars and try to start your own -- you'll probably fail, but you'll probably learn the most about yourself and life this way.)
Side interests are really important. I just searched you, and didn't find anything on Github. Create an account today. Just upload assignments you have from school, or anything that you've coded (provided you're allowed). Get a bunch of code samples out there. Go grab a great text book and solve random problems, and post your solutions. The goal is really just to show activity and interest more than that employers will look at specific code samples. If you can, get involved in an open source project -- just fix a bug or two and get your name out there as a contributor. If you're too busy, then just read blogs and ask questions. People love it when you comment on their blogs, and are more than willing to help, even if the question is not 'a good one'.
If you're a writer, write a blog. This is also a good idea if you know you suck at writing, as it's an opportunity to practice. Pick an algorithm or something and just discuss it or explain it in your own words. If you ever implement anything out of papers, discuss that. Here's about the best example of a niche blog I've ever seen: http://blog.echen.me/ Most of what he does is over my head, but I know he's really interested in AI and data analysis. If you're not a writer, just be active in the comments of other people's blogs. You have a very unique name, which is a great asset in the world with Google, but when I search, I get 1 page. There's no excuse for the first 3 pages of Google not to all be about you -- on twitter, commenting on other people's blogs, asking or answering questions on stack overflow.
Okay, I wrote much more than I had intended. The bottom line is that, as a new grad, you need to get the 'software engineer' check box out of the way so you can get into AI and Data work. Ideally, find a company where you can do that in the context of where you want to go. Analytics groups at Google or Amazon, Search at Google, or even DuckDuckGo. Personalization and recommendations at Facebook, LinkedIn, or Amazon. This early on, the wrong startup can be a land mine for your career (though the right one can be a gold mine). Once you get 2-3 years of solid software experience, if you're not onto data-mining and AI problems at your current job, then it's time to shop around. Just don't be in a hurry. Realize that you'll be going up against people with decades of experience and PhDs. You can leapfrog them over a few years, but don't think you'll do it 1 year out of college or underestimate how hard some people work on improving themselves on an ongoing basis.
Be humble, go where smart people are, and learn from them as much as you can. Realize that learning is ongoing. The most talented people out there are spending hours every week in study and practice not to mention their professional endeavors. Strive to join their ranks.
A public service message to all colleges and universities:
It seems that every few months another article about how to ‘hire a technical cofounder’ comes out — written for the multitude of would-be CEO-type co-founders. From the other perspective, the would-be CTO’s, I’ve not seen a lot of advice on how to choose a CEO. This is potentially a more difficult decision because you probably have more options and evaluating CEO talent is less objective than evaluating CTO talent (which is still not really objective, but more objective than evaluating CEO’s). The following are some criteria that I used, and would use again. I would say you need to answer an unqualified yes to each point, or you shouldn’t start the company.
0) Take your time. Don’t pick a CEO in a weekend, or a week. Take at least a month, if not more. You’re betting your time, energy, and (through opportunity cost) thousands of dollars on your partnership — don’t make the decision quickly.
1) Potentially the most obvious: is the CEO passionate about the idea? and are you just as passionate? If the idea is originally the CEO’s and he’s the first person you hear about it from, give yourself a month or two before committing. Wait for the ‘honeymoon’ to wear off, and make sure you still really care about your would-be customers and how you plan to serve them.
2) Do you agree on the premise of the business? It’s important that you and the CEO are on the same page with regard to the premise or overall goal of the business, as opposed to a specific idea or implementation. Your first attempts will not work.
3) Have the conversation about funding and exiting. Are you going for VC money or planning to bootstrap? Are you open to being acqui-hired by Facebook or Google? Or are you only trying to be the next Google?
4) Would you bet $100 000 that the CEO would find a way to succeed, with or without you? This is essentially what being a technical cofounder is doing. Presuming you’re well qualified, it is unlikely you will be able to pay yourself more than half what Google, Microsoft, or Amazon would be willing to pay for your services. You will effectively exchange the difference in liquid compensation for illiquid founder’s equity. Over a couple of years, the cash-equivalent difference is likely to exceed $100 000. So, before you choose to work with a CEO, make sure you would be prepared to bet $100K on his success. (If you think it’s a good bet, but not without you, then why would you partner with him?)
5) Is your CEO distinctly qualified in multiple ways? Before starting ThinkNear, Eli had run his own business, gotten his MBA from Harvard, and worked as a product manager at Amazon. If he had just one or two of these 3 ‘resume bullet points’, I’d have been far less likely to join him.
6) How great are his references? Get references. Get lots of references. It’s the easiest thing to check on, and a great CEO should be able to produce dozens of great references on demand. If he doesn’t have a bunch of (non-technical) friends or colleagues who would start a business with him at the drop of a hat, something’s not right. Yes, at the micro level, each reference is hand-picked as someone who strongly endorses the person. You’re looking for the macro -- does he have a lot of people with great resumes who would strongly endorse him?
7) Can he sell? Your CEO needs to be able to sell your product to customers, sell your company to investors, sell you to potential hires. You need someone who’s great at selling on the founding team, and if it’s the technical cofounder, you’ll be in trouble.
8) How do investors perceive him? This matters whether you’re fundraising or bootstrapping your way to success. You need a CEO who investors like, get along with, or at the least respect. Lacking respect will kill your company if you’re planning to fundraise (if investors don’t like your CEO but still respect him, it will just be harder but probably not impossible). Why does this matter if you’re bootstrapping? Because investors (by and large) know what they’re doing. If they don’t respect him, there’s probably a reason. At least one of the CEO’s references should be an investor.
9) Do you get along? Go grab some beer, and hang out for a couple of hours. Do this a few times. Bring up contentious issues — religion, politics, company values, etc. You’re trying to find something you’d argue over and see how you handle different opinions. Talk.
10) Does the CEO have similar economic standing (not off by, say, more than an order of magnitude) as you do? This should probably come up in prior points, but if selling for $30 mil vs $300 mil is the difference between life-altering sums of money for one of you and not the other, I would seriously consider not partnering.
11) Does the CEO have a track record of success? I would be more inclined to go work with someone on their first attempt at a company than I would someone who has started and failed at two or three businesses. There’s something to be said for persistence, but there’s also the track record. Without having any idea who the person was or anything about the companies, someone who has started more than a couple companies (which are not just consultancies or other lifestyle type businesses) and failed to grow or exit any of them successfully is either lacking some skill, competency, or trait; or is unlucky. Either way, I don’t want to bet on them.
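The arithmetic behind point 4 can be made concrete. This is a sketch with hypothetical numbers: the $150 000 big-company salary is an assumption for illustration only, and the founder salary is set to half of it, the ceiling the text suggests.

```python
def forgone_cash(market_salary, founder_salary, years):
    """Cash given up by taking the founder salary instead of the market one."""
    return (market_salary - founder_salary) * years


market = 150_000        # assumed big-company salary, not a real quote
founder = market // 2   # "no more than half", per point 4

print(forgone_cash(market, founder, years=2))  # 150000
```

Under these assumptions the two-year gap comes to $150 000, which is consistent with the claim that it is likely to exceed $100 000.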
There’s my list. Hopefully someone finds value in it.
Hack their products to do something cool.
Software Engineering is a skill. It’s like playing the piano, playing a sport, or being a surgeon. It takes a lot of practice and experience to reach a level of competency, and then an order of magnitude more to reach a level of mastery (or become a ‘rockstar’). I’ll be borrowing the musician analogy.
People who do not practice a skill may not be able to discern the difference between good and great. Do people who are not trained musicians expect to be able to tell the difference between good and great musicians? Probably. Can they? No. The same holds true for Software Engineering. Even within a group of people with a skill, deciding who is ‘better’ is at least partially a subjective process.
Just like other skills, software engineering takes a lot of time and effort to gain mastery. How much time? Probably in the neighborhood of 10 000 hours, maybe more. On top of that, it must be the right kind of effort.
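To put 10 000 hours in calendar terms, here is a small calculation. The weekly practice figures are assumptions chosen for illustration.

```python
def years_to_reach(total_hours, hours_per_week, weeks_per_year=52):
    """Calendar years needed to accumulate total_hours of practice."""
    return total_hours / (hours_per_week * weeks_per_year)


for hours_per_week in (10, 20, 40):
    print(hours_per_week, round(years_to_reach(10_000, hours_per_week), 1))
# 10 19.2
# 20 9.6
# 40 4.8
```

Even at a full-time 40 hours a week of deliberate practice, the total takes close to five years.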
Non-professionals, and even software engineers early in their careers, may see some of the “kids” coming out of college who are rockstars and think: “it must not be that hard”, or “I can do that”, or even “that’s what I do”. The reality is that many of the “kids” coming out of college have over a decade of real practice behind them. Many start coding before high school, and spend evenings and weekends practicing.
To think you will transition into software engineering / computer science and then go get a job at Google/Facebook/Amazon/etc in a couple of years is a lot like picking up the guitar and planning to star in a rock band in a couple of years. It’s been done, but success is uncommon. From personal experience, I also believe the effort required to become a great software engineer is systematically underestimated, and personal abilities are systematically overestimated.
I do not say this to dissuade those from entering the profession or dabbling on the weekends, but I think they should be realistic about the size of the task they are undertaking. That’s also not to say that being a rockstar is required to realize success. Pairing moderate programming skills with specific domain knowledge is certainly valuable, and, unlike musicians, there is significant demand for those who have not yet reached a level of mastery.
Through most of this article, I use the term ‘software engineering’, don’t I mean programming? No. This is another piece of the puzzle that is non-obvious, but there’s a very real difference. To keep the music analogy going, programming might be like playing an instrument, and software engineering is writing music. With music, you probably need a mastery of the former to be good at the latter, but it’s not necessarily a requirement (you can sell your music without having to play it). However, in Software Engineering, you usually have to play your own music and are judged on the final outcome. Great programmers without Software Engineering talent will play bad music exceptionally well, but it’s still bad music. Great Software Engineers who program poorly will have fantastic music in front of them but be unable to play it well. And rockstars are great at both.
The point of all this discussion is that we need to treat Software Engineering more like a skill and less like a profession. The skill must be developed and grown over time. You can lease out your skills for money, but doing that too much on jobs without challenges will cause your skills to stagnate unless you’re putting in extra practice. Working on the best challenges will help you improve significantly.
Also, a degree in computer science does not mean you are good at the skill, it just means you understand or have an overview of the skill, and (in the case of most post-grad degrees) that you have at one time demonstrated the capacity to competently solve problems in the space. A degree in computer science considerably improves your chances of being or becoming a ‘rockstar’, but is neither necessary nor sufficient.
This line of thought yields some advice that I think is particularly relevant to those in college or early in their career:
- Get really good at a programming language. I mean, really, really good. It probably doesn’t matter which one, but you need at least one. Getting really good probably takes 2-5 years of daily use.
- Don’t use an IDE when practicing. If there is a correlation between mastery of programming and mastery of an IDE, it’s negative (though mastery of an IDE is certainly an asset when not actively learning / practicing).
- Don’t use built in libraries you can’t write. If you can’t write your own hash/list/set/whatever, then you shouldn’t use the built in ones. The only way to know you can, is to do.
- Always be learning. Regularly learn new programming languages, algorithms, libraries, and technologies.
- Find a niche early, develop it, and then branch to other areas if you wish. Similar to 1, but conceptual. (A niche might be distributed systems, databases, UI, etc.) Also, realize that once you’ve been in a niche for 10 years, it’s very hard to change.
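As an illustration of the "write your own hash/list/set" advice, here is a minimal sketch of a separate-chaining hash set in Ruby that avoids the built-in Hash and Set. The class name TinyHashSet, the initial bucket count, and the load factor are all arbitrary choices made for this example, not anything prescribed above.

```ruby
# A minimal separate-chaining hash set, written without Hash or Set.
# TinyHashSet, INITIAL_BUCKETS and MAX_LOAD are illustrative assumptions.
class TinyHashSet
  INITIAL_BUCKETS = 8
  MAX_LOAD = 0.75

  attr_reader :size

  def initialize
    @buckets = Array.new(INITIAL_BUCKETS) { [] }
    @size = 0
  end

  # Adds value; returns true if newly inserted, false if already present.
  def add(value)
    bucket = bucket_for(value)
    return false if bucket.any? { |existing| existing.eql?(value) }
    bucket << value
    @size += 1
    grow if @size > @buckets.length * MAX_LOAD
    true
  end

  def include?(value)
    bucket_for(value).any? { |existing| existing.eql?(value) }
  end

  private

  # Ruby's % with a positive divisor is never negative, so this is a valid index.
  def bucket_for(value, buckets = @buckets)
    buckets[value.hash % buckets.length]
  end

  # Doubles the bucket count and redistributes every element.
  def grow
    new_buckets = Array.new(@buckets.length * 2) { [] }
    @buckets.each do |bucket|
      bucket.each { |value| bucket_for(value, new_buckets) << value }
    end
    @buckets = new_buckets
  end
end

set = TinyHashSet.new
100.times { |i| set.add(i) }
puts set.size
puts set.include?(42)
```

Writing the resize step by hand is usually where the interesting questions show up (when to grow, how to rehash), which is the point of the exercise.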
The best part of a startup is the opportunity to realize your own potential. There is no upper limit to the amount you can accomplish, so it's an exercise to see how far you can go. It's like entering a race, but not knowing if you're in a sprint, a marathon, or an Ironman. The result is that you have to treat it like an Ironman, but push as hard as you can at every opportunity you get just in case it turns out to be a sprint. And, as anyone who does crossfit or other endurance sports can attest, you need to be pushing hard 5 min into a workout in order to have a good time at the 20 min mark -- just not quite so hard that you fall over at 10 min.
To pick an analogy, being in a software startup is like running a marathon over unknown terrain, unpredictable weather, many paths to take and not knowing where the finish line is. Maybe you take the path that is lined with cheering fans, but you can't see a finish line ahead. Do you keep going down the path with the fans? Or do you turn and take the deserted path? Is it harder when some of those fans cheering for you have been in the race before? What if you know they've bet on you to win? How does that change your perception of the advice and the path you select? What if you decide on a path that the fans disapprove of, can you continue without their support?
What if you come to a mountain, and you think there's a chance of a finish line at the summit? Do you run up the mountain? Up thousands of feet of elevation through sleet and snow? Or do you run through the sunny meadow next to the mountain? There's nothing to say that there is not a finish line in the meadow -- maybe even a better finish line. If you choose the mountain, and reach the top, but find nothing, do you have the energy to try another? Was running up your mountain still worth it?
Through all this, there are other racers around you. Some are climbing mountains, some are in the meadows, and some are spending a lot of time trying to choose the right shoes to wear. Some are faster, some are slower, some have whole relay teams helping them get there, some have been training for years, while others can barely walk. Do you like to run in crowds? with friends? or solo? Will the other racers be there to help you, supporting other people running a race? Or will they be trying to get to their finish line at the expense of those around them?
When you come to a fork in the road, you can see a runner ahead of you down one of the paths. Do you follow the runner you see? After all, they were recently where you are, and chose that path as the better one. If the other runner chose a path leading to a finish line then you're going to have to catch up to them before they get there. Or you could take the path not traveled. Is doing something different than others valuable in and of itself? All finish lines are not created equal, nor are they placed with much reason. After years of struggle, are you the kind of person who can find a finish line and choose to go find another hoping it will be better? Some finish lines will win you rewards beyond your wildest dreams, others might offer a warm place to rest -- most will be in between the two. How much do you want the warm place to rest?
To sum up my thoughts on the analogy:
- Run as hard as you can every day. Rest when you must, but you might be in a sprint.
- All fans are not created equal: listen to those who have seen many races, have run the race before, and have bet money on you to win.
- Lots of people have run in meadows, few people have climbed mountains. Learning to climb mountains faster than anyone else is a valuable skill.
- Focus on the race. You're unlikely to reach the finish line ahead of the leaders -- so make sure you're a leader.
- Don't spend too much time choosing your shoes. Get in the race.
- The more you run, the better runner you become. Taking the wrong path 100 times will leave you a stronger runner than when you set out.
Finally, be in it for the race, not the finish line. If you enter a marathon because you want to cross the finish line, get the medal, and some free Gatorade along the way, you'll probably give up after a few miles, once you realize how hard it is. The people who compete in marathons do it because they love running, competing with themselves, and the journey. Crossing a finish line is just a goal to help with motivation.
Let's start out by stating that I have a lot to do. I am not in the situation I once found myself in in college where I had 3 tasks to do by Friday, and I could do them today or Thursday night. That's procrastination, and not what I want to talk about. I have orders of magnitude more work on my plate than I could hope to complete in months and I have a very real incentive to complete it as fast as possible. I am highly motivated and focused on maximizing my "engineering throughput". The only real governing factor is avoiding burnout.
What I'd like to talk about is a state that I enter when I have a tonne of work, I sit down to do it, and end up spending a lot of time on the web (on non-productive sites). For me, the symptoms are that every time I run tests or install gems (tasks that take a variable amount of time from 20s-2m), I check if anything new is on TechCrunch/HackerNews/GoogleReader, and may end up getting side-tracked for anywhere from 5-15min. For others, it might be chatting on IMs, or they might get sidetracked fixing a bug in an open source project, whatever. Individually, these distractions are not that important, but in aggregate, they can kill a whole day or longer. (On a related note, 'procrastinated' is also what happens to developers working in corporate environments that have three 30 min meetings equally spaced throughout the day.)
So why is what I'm describing not just "procrastinating" or "not working"? First off, unlike procrastination, being procrastinated is not voluntary. You don't leave the office thinking you put off the task, you leave thinking you had an unproductive day. Identifying when you've been procrastinated, and getting out of that state, is very important.
I've found that the cause of me getting stuck in this unproductive state (becoming procrastinated) is that I don't really have a clear idea of what I should be doing, and don't realize it. When I know that I need to add a page to our dashboard that pulls specific data from our database and shows it to the user, I can execute that in a quick and efficient manner. However, when I know that I need to "report on visits and conversion metrics", I risk becoming procrastinated. In this case, I think I know exactly what to do and so I set out to do it. But I don't actually know exactly what to do, so I start out by writing the code to save the visit, then realize I haven't done the database work yet, so I switch to that, and then I remember I need a view for the new database table, etc.
I end up not focusing because, it turns out, I don't know what I'm doing. I'm just doing "reporting on visits and conversion metrics". The problem with becoming procrastinated is that it's hard to catch. "Reporting on visits and conversion metrics" is something I've done before, it's not a tough problem, so I feel like I know what I'm doing. I also think I could do this in a few hours. The reality is that I need to decide what "conversion metrics" actually means, create new tables in the database and then the classes to access them, create new code in the controllers to catch all the visits, create new controllers for the new metrics pages, and then ... oh yeah, actually calculate the metrics. There are even more pieces if you're operating at scale.
Each of those things I listed, I can do quickly and efficiently. None of them is hard. What is hard is that there are too many parts to "reporting on visits and conversion metrics" to keep in my head, to estimate the work required accurately, or to be productive. Worse, since it's such an easy problem to solve, I treat it like it's easy and so don't break it down in detail -- I just set out to do it. And the worst part is, this leads me to become Procrastinated!
When I'm procrastinated, I have a fuzzy idea of what I'm doing, but I think I know exactly what I'm doing. I'm not sure exactly what needs to be done right now, but I don't realize it. Not being able to tell when you've been procrastinated is the worst part. I'm still writing little pieces of code, my tests aren't breaking, I'm making progress -- it's just not efficient at all. The reality is that I'm procrastinating on answering the questions I must answer, which are "exactly what do I need to accomplish to finish 'reporting on visits ...'?" and "in what order do I need to do them?" I'm avoiding detailed planning because that's hard. Reading blogs and news is like watching TV, my brain has basically checked out -- no need for active thought. Just read the headlines, see something shiny, click, skim, repeat. Planning out a task in detail requires your brain to engage and think.
What do I do to avoid being procrastinated? I try to track all of my time. Sort of like changing your eating or spending habits, I try to track and review all of my time to look for inefficiencies. What I've found working for me in the last few months is using Pivotal Tracker for everything technical I do, and entering tasks retroactively. So, when I get sidetracked on fixing some bug I find while implementing my new feature, I add that task in separately after the fact. This lets me look back and see everything technical I did in a day, week, or month, and I can review it. The most important part of using Pivotal Tracker (or otherwise tracking everything you're doing) is that it makes me write down everything I'm going to do, and forces me to take bite-size pieces. I'd never get away with a task in Pivotal called "reporting on visits and conversion metrics"; it would just look wrong to me. Am I going to have that as a comment on a checkin? No way! It would force me to break that down to its component parts, and suddenly I'd be flying.
I suspect that being procrastinated is a widespread problem that, over time, consumes huge percentages of potentially productive time, and that this is where pair-programming came from -- as a means to combat becoming procrastinated. I wonder if it would pay to hire an intern whose job it was to sit beside me all day, and, every half hour, just ask "what are you working on?" and "is that the highest priority item right now?"
Now that I think of it, a lot of the agile methodologies seem to be well-suited to address becoming procrastinated.
Lessons to take away:
1) Don't get Procrastinated
2) Learn to identify when you are Procrastinated, and identify why
3) Do that thing you're not doing that you identified in 2
4) Experiment with systems that reduce the chance of you becoming procrastinated, and/or
5) Experiment with systems that will allow you to more quickly identify when you have become procrastinated
(I have so much more work than I can handle that we're hiring engineers ;) http://careers.thinknear.com/)
When the job title reads: "Software Engineer - Rails focus"; and the job description includes passages such as: "When most people rank themselves from 1-10 on a language, they overestimate by at least 2-5 points -- YOU are legitimately an 8-9 in Ruby, preferably on Rails." You should probably have some sort of Ruby and/or Rails experience somewhere on your resume ... especially when there are other jobs listed by the same company that don't list knowing Ruby as a hard requirement.
We have 4 positions officially open. We're looking only for rockstar engineers. 3 of them don't require Ruby on Rails (just a willingness to learn and a track record of learning new technologies quickly). If you don't know Ruby, but would like to apply, please apply to one of the others. http://careers.thinknear.com
We've officially started hiring here at ThinkNear (http://careers.thinknear.com), and that means for the next month or so I'm 30%-50% technical recruiter/interviewer. As I'm sure anyone who's been hiring knows, getting resumes out of college is easier than out of industry. With that in mind, I've been interviewing a few college grads. Here are some tips they should take to heart, both when applying to ThinkNear and when applying to other companies.
This past winter at TechStars NYC, I worked harder and longer than I ever had before in my life, and I sustained it for weeks on end. We were building a product, facing new priorities and crises every week, and launching features faster than I ever had before. We were building a business at lightning speed.
ThinkNear is entering a second phase of its growth. This phase won't be like the first phase: we'll be building up from a base we've established (and growing -- http://careers.thinknear.com). In the first phase, we had to build the base as fast as we could, and we had a real sense of urgency -- the end of TechStars and then the end of our runway were just a few months away. In this second phase, there is no more TechStars, and the end of our runway will be more than just a couple of months away. We will have the same pressing deadlines with our partners, but we will need to establish our own milestones. More importantly, we need to find the right balance of work and not work so that we don't burn ourselves out. We need to find a balance between the urgency we need to make phase 2 a great success (ensuring there is a phase 3) and being alive and ready to face phase 3.
I'm calling whatever this point is Maximum Sustainable Effort, and I'm trying to find what mine is. I've never had anything push me as hard as this before, never had anything ask me to give everything I had, then ask why there isn't more. This, I'm told, is the thrill of being at a startup.
I've established that my Maximum Sustainable Effort is greater than 60 hours per week. Going home at 6pm on a Friday feels like taking half a day off. A quiet morning reading with coffee reenergizes me more than a whole weekend used to. I've also established that it is less than 90. There were a few 100+ hour weeks during TechStars, and I came dangerously close to burning out for more than a few days. I'm starting to think that my MSE exists in the 70's.
I'm also starting to believe that MSE is like muscle -- you can train it. Books I've read about training talent agree with me -- if raising your MSE is a talent. You can train yourself to focus on tasks for longer, and you can train your routine to become more efficient. One of my personal motivators has always been to stretch myself to realize my full potential. What I'm working on here has been a great motivator, a great training ground, and a great field on which to play. I'm really excited both about the future of ThinkNear and the personal growth I will get to realize by being part of an amazing team going after such an interesting problem in a huge market.
What's your Maximum Sustainable Effort? How did you reach it?
Tangent Story: We were moving so fast that at one point this past winter, we got an email asking for a feature we didn't have. As soon as this partner of ours asked for it, it was immediately obvious to me that we should have had it, and replying that we did not have it would have been embarrassing and probably ruined the partnership. So I did the only logical thing I could think of: I coded and launched the feature before replying to the email so that I could say we did indeed have it. This took a bit more than a day, so he still got a reply in a timely manner.
When I code, I often have an instance of the Rails console running in another window. Any time I get to a part of my code that I’m not 100% sure about, I just play around with it. It’s especially helpful when calling APIs or dissecting large structures of hashes.
One problem I have is that I code in multiple projects, and each of those projects has multiple environments on Heroku. Sometimes when troubleshooting I have my local Rails console open, and a couple of consoles of my app running on Heroku (am I getting the same results on Heroku that I am getting locally? how about sandbox vs. prod?). Sometimes it gets worse — our apps talk to each other. So maybe I have even more windows open for different apps.
My solution was to monkey patch IRB to let me drop in my own prompt config that works for Rails. It appears that there are no hooks into IRB to allow a custom commandline, so for right now, monkey patching appears to be the only way to go.
My code:
require 'irb'

module IRB
  class << self
    alias :orig_init_config :init_config

    def init_config(ap_path)
      begin
        puts "loading init config: #{Rails.env}"

        # Set up the prompt to be slightly more informative
        rails_env = Rails.env
        current_app = Rails.application.class.parent_name

        # a string we can sub the 2nd to last character depending on the context
        prompt_string_root = "#{current_app}(#{rails_env})%%3n%c> "
        # calculate manually since we don't count the trailing '> ' and %3n will be 3 chars wide
        normal_prompt_string_length = current_app.length + 1 + rails_env.length + 1 + 3
        empty_string_root = "#{' ' * normal_prompt_string_length}%c> "

        # http://tagaholic.me/2009/05/29/exploring-how-to-configure-irb.html#prompt
        prompt_config = {
          # Normal prompt (%zn – Line number with optional number z for printf width)
          :PROMPT_I => sprintf(prompt_string_root, '>'),
          # indent prompt
          :PROMPT_N => sprintf(empty_string_root, ' '),
          # string continue prompt
          :PROMPT_S => sprintf(empty_string_root, '"'),
          # statement continue prompt
          :PROMPT_C => sprintf(empty_string_root, '?'),
          # prefix output (leaves a space between the side effect output and return val)
          :RETURN => "\n=> %s\n"
        }

        # actually do the init
        orig_init_config(ap_path)
        # and override with our preferred prompt
        @CONF[:PROMPT].reverse_merge!(:RAILS_ENV => prompt_config)
        @CONF[:PROMPT_MODE] = :RAILS_ENV
      rescue Exception => e
        puts "Error loading IRB PROMPT", e
      end
    end
  end
end
Anyone know how to do this without monkey patching?
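To see what the sprintf calls in the patch actually produce, here is a small standalone sketch that builds the same prompt strings outside of Rails. The helper name prompt_strings and the values "MyApp" and "staging" are hypothetical stand-ins for Rails.application.class.parent_name and Rails.env.

```ruby
# Standalone sketch of the prompt-string construction used in the patch.
# "MyApp" and "staging" are made-up values standing in for the Rails ones.
def prompt_strings(app_name, env_name)
  # '%%' survives sprintf as a literal '%', leaving '%3n' for IRB to fill in
  # with the line number; '%c' is replaced here with the context character.
  root = "#{app_name}(#{env_name})%%3n%c> "
  # Width of "App(env)" plus the 3 characters IRB renders for %3n.
  pad = app_name.length + 1 + env_name.length + 1 + 3
  blank_root = "#{' ' * pad}%c> "
  {
    :PROMPT_I => sprintf(root, '>'),
    :PROMPT_N => sprintf(blank_root, ' '),
    :PROMPT_S => sprintf(blank_root, '"'),
    :PROMPT_C => sprintf(blank_root, '?')
  }
end

prompts = prompt_strings("MyApp", "staging")
prompts.each { |name, value| puts "#{name}: #{value.inspect}" }
```

With these inputs the normal prompt template is "MyApp(staging)%3n>> ", and the continuation prompts are padded with 17 spaces so the trailing "> " lines up under it once IRB substitutes the line number.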
References:
Putting config into an .irbrc or .railsrc doesn’t work for my use case since I wanted this to work on Heroku (which doesn’t support irbrc, right?) and IRB isn’t defined when Rails/config/initializers run.
While Googling around, I found some more tools I’m excited to try:
Update: I haven't figured out how to recover the elements in order using the Synchronized Priority Queue, so I'm using the TreeSet implementation.
I needed a data structure to store a series of elements in order. They were just numbers, but I wanted to retain all elements (so the traditional definition of a Set wouldn't work, since I don't want to lose duplicates). I decided that collecting them in order and paying the computational price on each insert was probably going to work better for my use case than sorting at the end.
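A minimal sketch of the "keep it sorted on insert, keep duplicates" idea, written in Ruby for illustration rather than in the Java of the original implementation. SortedBag is a hypothetical name; it uses a binary search to find the insertion point, so each insert costs a search plus an array shift.

```ruby
# Keeps elements in ascending order and retains duplicates.
# SortedBag is an illustrative name, not the class from the original project.
class SortedBag
  def initialize
    @items = []
  end

  # Inserts value after any existing equal elements.
  def insert(value)
    # First index whose element is strictly greater than value, or the end.
    index = @items.bsearch_index { |existing| existing > value } || @items.length
    @items.insert(index, value)
    self
  end

  def to_a
    @items.dup
  end
end

bag = SortedBag.new
[5, 1, 3, 3, 2].each { |v| bag.insert(v) }
p bag.to_a
```

Unlike a set, both 3s survive; the trade-off is that the array shift makes each insert linear in the worst case.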
We are using CloudWatch for a couple of custom metrics, but we collect the metrics at a rate beyond what CloudWatch will handle, so I wrote a class on our side to collect stats for us, which we can then periodically publish to CloudWatch as a StatisticSet.
Under our old system, each thread would create its own Datum, and then just add that datum to a shared, synchronized data structure -- just one point of contention across threads, and, what should be, a very fast one without the chance of conflicts.
However, writing our new implementation, we want to count along several dimensions for each thread. This will be a highly contested piece of code -- we serve hundreds of requests per second inside a single host and our latency constraints are very tight. I opted for a solution using several AtomicCounters.
I didn't just synchronize the method because I thought that that might introduce bigger (and more costly) points of contention. Having thought on this more, I don't have any confidence in saying that one implementation would perform better than another. Better to test both and figure it out.
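For illustration, here is a rough Ruby sketch of the simpler, lock-based variant of such a collector. It is not the Java code from the project: the class name StatCollector and its methods are assumptions made for this example, a single Mutex plays the role of the synchronized method, and the actual publishing to CloudWatch is left out. The four fields mirror what a CloudWatch StatisticSet carries (sample count, sum, minimum, maximum).

```ruby
# Thread-safe collector that aggregates values into StatisticSet-style fields.
# StatCollector is a hypothetical name; publishing to CloudWatch is not shown.
class StatCollector
  def initialize
    @lock = Mutex.new
    reset
  end

  # Called from many threads; the single lock is the point of contention.
  def record(value)
    @lock.synchronize do
      @count += 1
      @sum += value
      @min = value if @min.nil? || value < @min
      @max = value if @max.nil? || value > @max
    end
  end

  # Returns the aggregate collected so far and starts a fresh window.
  def flush
    @lock.synchronize do
      snapshot = {
        :sample_count => @count,
        :sum => @sum,
        :minimum => @min,
        :maximum => @max
      }
      reset
      snapshot
    end
  end

  private

  def reset
    @count = 0
    @sum = 0
    @min = nil
    @max = nil
  end
end

collector = StatCollector.new
threads = 4.times.map do
  Thread.new { 1000.times { |i| collector.record(i) } }
end
threads.each(&:join)
stats = collector.flush
p stats
```

A periodic job would call flush and send the resulting hash on as one StatisticSet, so CloudWatch sees one data point per window instead of one per request.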
Test:
10 threads, each one performing 20 billion runs. Each run increments 2 values, 25% of the time we increment a third counter, 6% of the time we increment a fourth counter, and we track a maximum. This roughly mirrors the logic I'm using in my production implementation, at least the part that's under contention.
Source code is on github: https://github.com/softwaregravy/AtomicVsSynchronized
I ran it a few times just using the unix time command. The results were consistently close with the two samples shown below:
AtomicRecorder
java TestHarness 16.58s user 0.09s system 190% cpu 8.756 total
SynchronizedRecorder
java TestHarness 12.87s user 10.12s system 165% cpu 13.845 total
So, with the results in hand, I feel justified in my decision to use the Atomic values vs the synchronized method. Under heavy contention, which I expect in this particular component, they perform significantly better (overall).
That said, the synchronized version spends less time in user space than the atomic version (12.87s vs. 16.58s) and shows lower CPU utilization (165% vs. 190%), though it burns far more system time (10.12s vs. 0.09s). The synchronized version is also clearly the simpler of the two, which becomes more apparent as you grow the class beyond trivial. And maybe that particular piece of my code won't be as contended as I think it will be?
Another test, this time with zero contention: 1 Thread performing 200 billion runs.
AtomicRecorder
java TestHarness 7.75s user 0.06s system 100% cpu 7.738 total
SynchronizedRecorder
java TestHarness 7.16s user 0.06s system 100% cpu 7.181 total
Synchronized is slightly faster with no contention, which makes sense intuitively.
Overall, it's a tradeoff. Atomics are significantly faster, but only under extremely high contention, and they make your code more complex. For 99% of the use cases out there, synchronized is probably the much better choice. For the other 1%, make sure you're really in that 1% before paying for the complexity; otherwise you're wasting optimization effort.
Subscribe to an SNS notification with your service via HTTP? It sends a confirmation message in a POST. I couldn’t find this documented anywhere, so here’s what I got (a couple of random characters removed from signatures and endpoints):
#json
{
"Type" : "SubscriptionConfirmation",
"MessageId" : "84508b4c-be5d-4ac4-9c1d-96eab2b6fe6e",
"Token" : "2336412f37fb687f5d51e6e241d09c805f2fe718d402ad3291fbbf65d1089c48a1573ff939ca4584a571e9aa496256c6ec9a73bc8e450e4fea07d51d3ed3bf2cd1814e095f19d3671c5566850da313940d0b006a00ccc6226e8f0fa774831c5aabc015eb563f6418c01855f144c1453",
"TopicArn" : "arn:aws:sns:us-east-1:35308450730:ThisIsATest",
"Message" : "You have chosen to subscribe to the topic arn:aws:sns:us-east-1:35308450730:ThisIsATest.\nTo confirm the subscription, visit the SubscribeURL included in this message.",
"SubscribeURL" : "https://sns.us-east-1.amazonaws.com/?Action=ConfirmSubscription&TopicArn=arn:aws:sns:us-east-1:35308450730:ThisIsATest&Token=2336412f37fb687f5d51e6e241d09c805f2fe718d402ad3291fbbf65d1089c48a1573ff939ca4584a571e9aa496256c6ec9a73bc8e450e4fea07d51d3ed3bf2cd1814e095f19d3671c5566850da313940d0b006a00ccc6226e8f0fa774831c5aabc015eb563f6418c01855f144c14535",
"Timestamp" : "2012-01-19T20:13:34.281Z",
"SignatureVersion" : "1",
"Signature" : "bpfXFxjcDPYDxAikOCiYrYHWgcmJKDelkrckTtYaM6IuBZgOcHedP3bxuCONwHVQRGnBMPk6/RT8nOjkX54ntWz3/2Z7YZNprDE1qJUplF0AcVPd2dPYcwy+mbE2qCs6PtqPAJ10Qz475BqFF9nHE07A9MSG8RXHQh1t0GMs=",
"SigningCertURL" : "https://sns.us-east-1.amazonaws.com/SimpleNotificationService-f3ecf7224c7233fe7bb5f59f96de52f.pem"
}
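As the Message field itself says, the subscription is confirmed by visiting the SubscribeURL. Here is a minimal Ruby sketch of how an endpoint might handle that body. The method names subscribe_url_from and confirm_subscription are made up for this example, the sample payload uses placeholder values, and a real handler should also verify the message signature, which is not shown here.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Returns the SubscribeURL if the POST body is a subscription confirmation,
# or nil for any other message type. (Hypothetical helper name.)
def subscribe_url_from(body)
  message = JSON.parse(body)
  return nil unless message["Type"] == "SubscriptionConfirmation"
  message["SubscribeURL"]
end

# Confirms the subscription by issuing a GET to the SubscribeURL.
# Not called below, since it would make a real network request.
def confirm_subscription(body)
  url = subscribe_url_from(body)
  return nil if url.nil?
  Net::HTTP.get_response(URI(url))
end

# Placeholder payload shaped like the message above (values are invented).
sample = JSON.generate(
  "Type" => "SubscriptionConfirmation",
  "TopicArn" => "arn:aws:sns:us-east-1:000000000000:ExampleTopic",
  "SubscribeURL" => "https://sns.us-east-1.amazonaws.com/?Action=ConfirmSubscription&Token=example"
)
puts subscribe_url_from(sample)
```

Keeping the parsing separate from the HTTP call makes the handler easy to test without touching the network.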
I’ve been working with a lot of AWS lately, and I’ve decided it’s a great background that engineers should have. Not AWS, per se, but just a deep understanding of infrastructure and infrastructure problems. Most graduates of CS programs lack this. To my knowledge, colleges are not offering AWS 101 nor a theoretical counterpart (scalable architecture 101). Maybe they should. Not because AWS is great in itself, but it exposes people to all the complex infrastructure decisions that go on at some level under every organization. Also, if you ‘get’ AWS, the skills will be very transferable to other cloud providers.
If there were an AWS course, I think it would cover the following:
- spin up an EC2 instance (bare Amazon 64-bit AMI), and log into it
- mount EBS volumes (8 1TB drives in RAID0 anyone?)
- install Sun Java 6_latest
- create your own AMI
- build a .war with maven
- install Tomcat on EC2 and run a simple webapp
- customize Tomcat’s server.xml (just make a simple change)
- set up a Mongo replication set on their own AMI
- set up Mongo sharding
- be able to have simple writes from your Tomcat webapp into Mongo
- simulate failures of mongo instances
- take backups (snapshots) of your mongo data
- restore from your backups without losing things
- run elastic load balancer against many instances in EC2
- simulate failure of an availability zone — your app and db’s should continue to run
- run autoscaling (your apps are all your ami and start as a service, right?)
- run a load test against your set up with JMeter, also running in the cloud
- use Elasticache to store the last 10000 read values from Mongo, and rerun the load test
- tune Tomcat, the JVM, and Mongo to improve results
- build and deploy your code with 1 command
Do all of the above from the command-line tools from Amazon. Write your own scripts in Ruby.
Document every step on a blog.
Put all your scripts and stuff in Github.
You now have an amazing resume item for any new grad (and pretty good one for most professionals, too). Total time invested will be less than most University courses, cost is most likely under tuition for 1 course (most of this is free), and you now have a great understanding of a very common web architecture. In fact, if a new or soon to be graduate of a CS program did all this, I would promise an interview or, if you’re still a ways away from graduating, an internship at ThinkNear. (I’m also looking to talk to anyone with experience doing this in industry as well, but this is particularly impressive for someone early in their career.)
Complete this list and get an internship or interview with ThinkNear
Note: Maven, Tomcat, Mongo, and JMeter were selected ‘at random’ because they’re open source and well-known with lots of docs out there and good communities. I think learning any technologies is valuable, so feel free to sub out your favorites: Ant, Rake, Jetty, Rails, Django, etc.
For bonus:
- run inside a VPC
- use Route53 to give all your hosts friendly names
- have new hosts that start up auto-register themselves with Route53 and load balancers to start taking load
- run a load test, and watch your fleet autoscale up to meet demand (you’ve got autoscaling working, right?)
- simulate an availability zone failure, and watch your fleet autoscale up to meet demand
It’s taken a few years, but I am firmly on the Test-Driven Development bandwagon. Testing is everywhere in the Ruby community. I can honestly say that RSpec changed the way I work. The way it ties language into tests just makes coding up tests really, really useful. They truly are the spec, and you can read them. How about that! I can seriously print out my tests, give them to our business, and just say: that’s what I’ve built.
Now, when I code, I start with the specs. Occasionally I code a method first if I’m figuring something out, but once I’ve got it I comment it out and go back to the specs.
- I organize my specs by method
- make heavy use of contexts and before blocks
- It blocks should be short and sweet, I abuse the commentless variety
Step 0) sanity
- latest code from origin
- pivotal task started
- all specs pass
- guard running
Step 1) specs
describe "#method_name" do
  context "when the user does not exist" do
    it "should post to Airbrake"
    it "should return a 500"
    it "should render the 'does not exist' page"
  end
end
Step 2) setup
describe "#method_name" do
  context "when the user does not exist" do
    let(:user_id) { 500 }

    it "sanity test" do
      User.find_by_id(user_id).should be_nil
    end

    it "should post to Airbrake"
    it "should return a 500"
    it "should render the 'does not exist' page"
  end
end
- use the before block to make your context statement true
- where I make assumptions, I sanity test them before the other tests
Step 3) fill in the specs
it "should post to Airbrake" do
  Airbrake.should_receive(:notify)
  get :method_name, :user_id => user_id
end

it "should return a 500" do
  get :method_name, :user_id => user_id
  response.code.should == '500'
end

it "should render the 'does not exist' page" do
  get :method_name, :user_id => user_id
  response.should render_template("error")
end
Step 4) Write the code
Exercise for the reader :)
General principles
- specs before code
- when you have a bug, get a test to make it fail before fixing it
Your specs are an asset. They specify your system; the better your code coverage, the better they specify it. Forcing yourself to write specs first forces you to really think through what you’re doing and to see clearly how you will solve the problem before solving it. To introduce a bug, you either need to neglect testing, overlook something, or, god forbid, write the same mistake down in 2 places. Testing leaves only overlooking things, which is actually a smaller percentage of bugs, at least for me, than off-by-ones, nil variables, and other errors.
If there were another way to force the diligence that tests put you through (the thought process, the error checking, the edge case consideration), tests would be less valuable. They’d still be valuable for the sake of the spec. However, I’ve found no such methodology, procedure, or tool which can force upon me the same discipline of thought that actually writing out hundreds of lines of testing does.
Git is great. After years of being a Perforce user, it took a while to adapt, but now I’m very happy with git. Git is powerful, flexible, and generally awesome. The one thing git doesn’t have going for it is that if you ask several developers ‘how do you use git?’, you can get several answers. The good news is that git is flexible enough to have ‘good’ workflows for teams from 1 to 100000; the bad news is that those workflows are often different, not well standardized, and it’s easy to bludgeon your way to a mostly working workflow that is badly suited for your needs (or downright wrong).
As our engineering team is growing, I’ve been doing some research on the ‘right’ workflow for us to use going forward. I looked at a number of resources, and have settled on the ‘rebase workflow’. There are some good write-ups out there, but the simplest and best suited to our needs I could find was on Rein Henrichs’ blog ( http://reinh.com/blog/2009/03/02/a-git-workflow-for-agile-teams.html). The following is how we’ll use git at ThinkNear, and we borrow heavily from Rein’s model.
Workflow
0) Get a story or bug from Pivotal Tracker. If you need to make a change that’s not in Pivotal, create a story in Pivotal to track your work.
- mark a task as started as soon as you start it so others know you’ve started it
1) Get a local, up to date copy of master
git fetch origin
git checkout master
git merge origin/master
At this point, git diff master..origin/master should be empty.
2) Create a branch for the feature with a relevant name
git checkout -b fixin_broken_stuff
3) Do work on your branch
- check in often, lots of small working commits
- check for/get the latest from origin/master at least once per day
- use rebase when getting the latest
- optionally, and carefully, use origin as a place to backup your work
- a good idea if you’ll be several days away from master
3.5) Update from origin/master using rebase
git fetch origin
git rebase origin/master
4) When finished, rebase the work to master
git fetch origin
git rebase -i origin/master
- this will open a file editor listing all the new commits from your feature branch:
  - pick the first commit
  - squash the rest
  - save and close the file
- another file will open, this will be your new commit message
- Prefix commit messages with Pivotal story id, see https://www.pivotaltracker.com/help/api?version=v3#scm_post_commit_message_sy... gets to github through the hooks here https://www.pivotaltracker.com/help/api?version=v3#github_hooks
- Good commit message practices: http://tbaggery.com/2008/04/19/a-note-about-git-commit-messages.html
- resolve conflicts along the way (read the git messages, they tell you the way forward)
5) Merge to master and push to origin
rspec spec # add any additional tests here, point is, everything works before commit to master
git checkout master
# we just fetched from origin a second ago, right?
git merge origin/master
git merge fixin_broken_stuff
rspec spec
git push origin master
6) Mark the tasks in Pivotal as Finished once the code is in and pushed to origin/master
- accepted once they’re in production
- between ‘finished’ and ‘accepted’ is our deployments and verification testing, which will have to be in another post
General principles with this workflow are
- commit early and often while working
- favor larger commits in master over many smaller commits
- 1 branch per developer, push to origin if you want, but the only source of collaboration is through master
- master should pass tests at all times
- master should be ‘pushable-to-prod’ at any time
- code reviews happen retrospectively through comments in github
- Pivotal should contain a record of all work completed and every commit should be somehow related to a pivotal story (see step 0)
Resources:
Git
Git Workflow
- http://www.randyfay.com/node/89
- http://www.randyfay.com/node/91
- http://reinh.com/blog/2009/03/02/a-git-workflow-for-agile-teams.html
- http://stackoverflow.com/questions/3817967/correct-git-workflow-for-shared-fe...
- http://stackoverflow.com/questions/804115/git-rebase-vs-git-merge/804178#804178
- http://stackoverflow.com/questions/457927/git-workflow-and-rebase-vs-merge-qu...
- http://unethicalblogger.com/2010/04/02/a-rebase-based-workflow.html
I’ve been working on making my specs faster for a while now. Now that my largest project has thousands of specs, they take minutes to run. I’ve found two easy fixes that I’m actively applying.
Avoid Create
A lot of my model specs center around creating objects and testing their
methods in different states. I’ve found generally that 90% of a model can be
tested without create. I now use new heavily.
As an example, I have a fairly simple class called TimePeriod. It does fairly obvious things, like having a duration. One thing we do is compare them. Here’s an excerpt from its spec file.
# before
describe "<=>" do
  it "first compares days" do
    l = TimePeriod.create(:start_day => 1, :start_time => 10.hours.to_i, :duration => 1.hours.to_i)
    r = TimePeriod.create(:start_day => 2, :start_time => 10.hours.to_i, :duration => 1.hours.to_i)
    (l <=> r).should == -1
    (r <=> l).should == 1
  end
end
# after
describe "<=>" do
  it "first compares days" do
    l = TimePeriod.new(:start_day => 1, :start_time => 10.hours.to_i, :duration => 1.hours.to_i)
    r = TimePeriod.new(:start_day => 2, :start_time => 10.hours.to_i, :duration => 1.hours.to_i)
    (l <=> r).should == -1
    (r <=> l).should == 1
  end
end
It’s actually a fairly complex class under the hood, and has a lot of edge cases. In total, I have 130 specs for this class, though most are fairly simple. In total, there were 83 instances where I could replace create with new.
Across 3 runs, the 130 specs before the change took 3.61, 3.67, and 3.9 seconds. After making the change, they took 3.08, 3.15, and 3.1 seconds. All times were reported by rspec, running inside guard, against spork with pandora running in the background, and multiple browser windows with dozens of tabs each: a fair approximation of my normal development conditions. From my sample size, that’s a 17% improvement.
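As a quick check on that figure, here is a small sketch that averages the three runs on each side and computes the relative improvement. The helper names (`mean`, `improvement_percent`) are mine, not part of the project; the timings are the ones quoted above. On these numbers it comes out at roughly 16.5%, which rounds to the 17% I quoted.

```ruby
# Arithmetic mean of a list of numbers.
def mean(values)
  values.inject(0.0) { |sum, v| sum + v } / values.size
end

# Relative speed-up of the `after` runs versus the `before` runs, in percent.
def improvement_percent(before, after)
  (mean(before) - mean(after)) / mean(before) * 100.0
end

# Timings in seconds, as reported by rspec for the TimePeriod specs.
before_runs = [3.61, 3.67, 3.9]
after_runs  = [3.08, 3.15, 3.1]

puts improvement_percent(before_runs, after_runs).round(1)
```

With only three runs per side this is a rough estimate, not a benchmark.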
Avoid Create 2 => Avoiding Factory
Turns out this was a very powerful idea that I can apply more generally: there are hidden calls to create all over the place. Let’s take a look at my business class. In my project, a business belongs to a user. Now, the user isn’t actually being tested in my business_spec, and there is no behavior of my business that depends on any state in the user, but my business is not valid without a user. Turns out I was creating a user (yeah, in the db) for every spec. My business has 138 specs. On 3 runs, they took 11.01, 10.95, and 10.84 seconds.
So I made the following change to avoid the ‘create’ call implicit in the factory:
# before
let(:user) { Factory(:user) }
# after
before :all do
  @user = Factory(:user)
end
after :all do
  @user.destroy
end
let(:user) { @user }
After the change, they took 6.03, 6.45, and 6.3 seconds. From this sample, that’s a 43% improvement. Here’s my factory:
Factory.define :user do |user|
  user.sequence(:email) {|n| "1test#{n}@sample.com"}
  user.password 'secret'
  user.password_confirmation 'secret'
end
Pretty simple, eh?
Quick Summary
Applying these two principles will give differing results depending on the object under test, but in general, it’s worth being mindful of object creation. The obvious alternative that would have improved speed greatly would have been to abandon the principle of 1 should per spec. To a very large degree, I follow this mantra. Combining my 130 specs (many of which are 1 line) into 30 big specs, or even 20 mega specs, would certainly have given at least as great a speed improvement. Even then, these two changes would help avoid some costly creates.
- rails 3.0.10
- rspec 2.6.0
- factory_girl 2.1.2
- spork 0.9.9rc9
- guard 0.6.3
Someone appears to be ahead of me https://github.com/pcreux/rspec-set
I was in a partial for the fields of a form today where f was my form
builder. One of the fields on my model was ‘text’, but I am using the
serialize method, so I actually needed to take in an array of values.
I couldn’t figure out any method of f that could give me access to the
array, so I took the HTML normally generated by f.text_field and just
added [] to the end of the name in order to pass it as an array
value. But with just the raw HTML, I need to populate the value, and all
I had was a reference to f. Turns out, the object being built by f is in
the object field (f.object).
Here you can see a contrived example with a model, the view before I made the modification, and the view that now lets me edit elements in the serialized array.
# my_value_array :text
class MyModel < ActiveRecord::Base
  serialize :my_value_array, Array
end
Since I wasn’t being clever, I just gave myself space to add at most 2 values to the array at a time. However, this meant I needed to clean up empty values in my controller.
def update
  @my_model = MyModel.find_by_id(params[:id])
  params[:my_model].try(:[], "my_value_array").try(:reject!){|v| v.blank? }
  if @my_model.update_attributes(params[:my_model])
    # blah blah blah
  end
end
Full disclosure, I boiled the sample code down from a much more complex example. I think it should work, or at least get you on the right track.
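The clean-up step in the controller is easy to try outside Rails. Below is a minimal plain-Ruby sketch of the same idea: drop the empty slots from the submitted array before saving. The helper name `strip_blank_values` is hypothetical, and it spells out the blank check by hand because `blank?` and `try` come from ActiveSupport.

```ruby
# Remove nil, empty, and whitespace-only entries from the array stored under
# `key` in a params-like hash. A missing hash or a missing key is left alone,
# mirroring the nil-safe `try` chain in the controller.
def strip_blank_values(params, key)
  values = params && params[key]
  return params unless values.is_a?(Array)

  values.reject! { |v| v.nil? || v.to_s.strip.empty? }
  params
end

form = { "my_value_array" => ["red", "", "  ", nil, "blue"] }
strip_blank_values(form, "my_value_array")
p form["my_value_array"]  # => ["red", "blue"]
```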
I recently built an app with a multistep, implicit registration. By that I mean that there are multiple pages of the registration process, and we create an account for the user implicitly (we just email them their password after we create an account for them). Someone asked me about it, so I created a little app to demonstrate roughly what I implemented in my real app.
The app is online here. The source is available here.
I used implicit registration. Good Idea
I do like the implicit registration bit, but I do not like the multistep registration flow. The implicit registration is just great from a UX perspective. A bit weaker from the security perspective, but still not terrible. The security weakness could be greatly mitigated by requiring a password change on a subsequent log in, or otherwise expiring it after a certain period of time.
def register_user
  password = User.send(:generate_token, 'encrypted_password').slice(0, 6)
  user = User.create!(:name => name, :email => email, :age => age, :password => password, :password_confirmation => password)
  Notifications.signup(user, password).deliver
  self.user = user
  self.save!
end
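The expiry mitigation mentioned above could look something like the sketch below. This is not what the demo app does: `TemporaryPassword` and `generate_temporary_password` are names I made up for illustration, and the 24 hour default is an arbitrary assumption. It only relies on `SecureRandom` from the standard library.

```ruby
require 'securerandom'

# Hypothetical value object: a generated password plus the time it stops working.
TemporaryPassword = Struct.new(:value, :expires_at) do
  def expired?(now = Time.now)
    now >= expires_at
  end
end

# Build a short random password that is only valid for `ttl_seconds`.
def generate_temporary_password(ttl_seconds = 24 * 60 * 60, now = Time.now)
  # SecureRandom.hex(3) returns 6 hexadecimal characters.
  TemporaryPassword.new(SecureRandom.hex(3), now + ttl_seconds)
end

issued_at = Time.now
temp = generate_temporary_password(3600, issued_at)
puts temp.value.length                    # 6
puts temp.expired?(issued_at + 60)        # false
puts temp.expired?(issued_at + 2 * 3600)  # true
```

The login code would then refuse an expired password and send the user through a reset instead.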
I used a state machine to drive the process. Bad Idea
state_machine do
  state :email, :exit => lambda {|reg| reg.errors.clear }
  state :age, :exit => lambda {|reg| reg.errors.clear }
  state :complete, :enter => :register_user
  event :next do
    transitions :from => :email, :to => :age, :guard => :guard_to_age
    transitions :from => :age, :to => :complete, :guard => :guard_to_complete
    # do not allow complete => complete
  end
end
This was a lot more effort than it was worth. I saw something similar used while browsing the source code of spree and I wanted to give it a try. It seems so clean at first, but then you have to start building in one-off cases here and there, and soon your controller is a mess because everything is in update. The update is already a mess in this extremely simple example, it just gets worse.
def update
  @registration = Registration.find(params[:id])
  redirect_to users_path(@registration.user_id) and return if @registration.state == 'complete'
  if @registration.update_attributes(params[:registration])
    if @registration.next!
      if @registration.state == 'complete'
        #can only reach this block on first completion -- or next will have failed
        sign_in(:user, @registration.user)
        redirect_to user_show_path and return
      else
        redirect_to registration_state_path(@registration, @registration.state) and return
      end
    else
      flash[:alert] = @registration.errors
      render :template => get_template_for_state(@registration, @registration.state) and return
    end
  else
    flash[:alert] = @registration.errors
    respond_with(@registration, :location => registration_state_path(@registration, @registration.state))
  end
end
I strongly regret not just having a different action for each view, with a different action for each update of the step. Live and learn.
Duplicate Data or Manual Validations or Duplicate Manual Validation
With a multistep registration flow, I realized that you’re left with a tradeoff: you can either get into validations manually or you can duplicate data. If you use a registration object, like I did in the example, I’m collecting data from the user which I will use to build the actual model. The result is that I need to perform any data validations that the User model wants on the data as I collect it. The example from my code is that I need to check on the uniqueness of an email, required by the User, during the registration.
def require_email
  errors.add(:email, "Email is required") unless email.present?
  if email.present?
    errors.add(:email, "Email is already taken") if email_in_use?(email)
  end
  errors.empty?
end
The alternative is to put the data you’re collecting directly into the model as you collect it so the validations happen as you go, but then you probably need to have a ‘not ready’ state/flag on the model. To illustrate, suppose we create a User at the time we create the Registration. However, to be valid, our User requires a name, email, and age. Until the User has all those values, they are not valid. If users have only given us their name and email, but not their age, they should not be able to log in or use the system. This means that you’ll need to account for this everywhere you interact with the User model. You could create a default scope, but then you need to be aware when to undo that.
Between the two options, I lean towards the ‘not ready’ flag. This will scale as you add values, data and validations live in one model, and, if you use default scoping, you should be able to isolate the knowledge of the update flag to the Registration object. However, if you cannot do away with the registration object, then you will have large ugly update methods, because they will have to update both the Registration and the User.
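To make the ‘not ready’ idea concrete, here is a plain-Ruby sketch with no Rails involved. `DraftUser` is a hypothetical stand-in for the User model, and in this sketch the flag is derived from the missing fields rather than stored in a column, which is my simplification.

```ruby
# Hypothetical stand-in for a User that is filled in across several steps.
class DraftUser
  REQUIRED_FIELDS = [:name, :email, :age]

  attr_accessor :name, :email, :age

  # Fields the registration flow still has to collect.
  def missing_fields
    REQUIRED_FIELDS.select do |field|
      value = send(field)
      value.nil? || value.to_s.strip.empty?
    end
  end

  # The 'not ready' flag, derived rather than stored in this sketch.
  def ready?
    missing_fields.empty?
  end
end

user = DraftUser.new
user.name  = "Pat"
user.email = "pat@example.com"
p user.ready?          # => false
p user.missing_fields  # => [:age]

user.age = 30
p user.ready?          # => true
```

Anything that lets a user log in would check `ready?` first, which is the part a default scope would hide in a real app.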
Conclusion
The final conclusion is that I really dislike multistep registration. All the approaches I’ve found have tradeoffs. Having built a complex registration flow once, I’m much stronger for it.
I really do not like Rails Time Zones. More specifically, I do not like solutions and questions that are easily findable by Google. I'm sending an email, and had the following string in the text: "... is only valid until 2011-06-10 03:30:00 UTC". Not what I want to be sending. So how do I convert that? Googling is a pain because most people seem to be handling this in a 'global' way by setting Time.zone. Ryan Bates does it here, http://railscasts.com/episodes/106-time-zones-in-rails-2-1. Here's a typical answer on StackOverflow. They're all just doing a `Time.zone=`.
Why don't I like that approach? Well, at first it was setting off my concurrency spidey sense. After seeing the solution everywhere, I checked the source, and that method is (unintuitively) setting a thread-local variable. All I wanted to do was change the view! I don't want to have to worry about whether this could impact other parts of my code, even if it was just in that request. (The root cause of this is that I have a date-heavy part of my code which I know is not time zone aware. Yes, I have to fix it.)
Anyway, after hunting through the source, I finally found in_time_zone. Now all you need is the magic string to pass into it. I wanted Eastern, so I needed
Time.now.in_time_zone("Eastern Time (US & Canada)")
All magic strings available in ActiveSupport::TimeZone.us_zones
$ ActiveSupport::TimeZone.us_zones
=> [(GMT-10:00) Hawaii, (GMT-09:00) Alaska, (GMT-08:00) Pacific Time (US & Canada), (GMT-07:00) Arizona, (GMT-07:00) Mountain Time (US & Canada), (GMT-06:00) Central Time (US & Canada), (GMT-05:00) Eastern Time (US & Canada), (GMT-05:00) Indiana (East)]
Great learning experience for me.
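For comparison, plain Ruby (no ActiveSupport) can do a similar one-off conversion for display without touching any global or thread-local zone setting. This is only a sketch under assumptions: `Time#getlocal` takes a fixed UTC offset, so unlike `in_time_zone` it knows nothing about daylight saving time, and the `-04:00` offset (Eastern Daylight Time in June) and the helper name `format_at_offset` are my own choices for the example.

```ruby
# Format a time at a fixed UTC offset without mutating the original Time
# or changing any process-wide zone setting.
def format_at_offset(time, offset)
  time.getlocal(offset).strftime("%Y-%m-%d %H:%M:%S %z")
end

# The timestamp from the email, as a UTC Time.
valid_until_utc = Time.utc(2011, 6, 10, 3, 30, 0)

puts format_at_offset(valid_until_utc, "-04:00")  # 2011-06-09 23:30:00 -0400
puts valid_until_utc.utc?                         # true, the original is untouched
```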
Today I spent much of the day trying to get a migration to run in Rails. It's on a table with hundreds of thousands of rows and a couple hundred MB (not really that big, IMHO). The job consistently failed with a 'failed to allocate memory' error at approximately row 150 000.
The migration is a logical one -- we've added a new column, and this migration is a backfill. We have some business logic around what we want in this column, so it's something I want ruby code to set rather than recreate our business logic in SQL. I spent a lot of time with find_each, trying to wrap the jobs, pulling the logic into its own rake job, and anything else I could find. No go.
In the end, I can get 100 000 rows to migrate (back-fill), so here we are running a script locally to migrate our table 100K rows at a time. In case anyone wanted to view the horribleness that is how we're doing our backfill, it's included below. What's going wrong in find_each that it's not able to operate over more than 150K rows? Since I'm saving back during each block, is that causing issues? Here's the (scrubbed) code
namespace :logical_migration do
  task :task_name, [:from_index, :to_index] => :environment do |t, args|
    last_min = args.from_index.to_i
    # walk the id range in steps of 1000, loading one slice of rows at a time
    (args.from_index.to_i..args.to_index.to_i).step(1000).each do |this_max|
      Model.find(:all, :conditions => ["id >= ? and id < ?", last_min, this_max]).each do |m|
        m.make_our_change
        m.save!
        puts m.id.to_s if m.id % 500 == 0
        last_min = this_max
      end
    end
  end
end
The puts there is because this job runs for quite a while, and I like feedback so I don't panic and kill it.
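The range-stepping part of that script can be pulled out into a small helper, which makes the batch boundaries easy to check on their own. This is a sketch only: `id_batches` is a name I made up, it uses half-open ranges to match the `id >= ? and id < ?` condition, and unlike the rake task it always includes `to_id` in the last batch.

```ruby
# Yield [lower, upper) id boundaries covering from_id..to_id in fixed-size steps.
# Returns an Enumerator when called without a block.
def id_batches(from_id, to_id, batch_size = 1000)
  return enum_for(:id_batches, from_id, to_id, batch_size) unless block_given?

  lower = from_id
  while lower <= to_id
    # Cap the last batch so it stops just past to_id.
    upper = [lower + batch_size, to_id + 1].min
    yield lower, upper
    lower = upper
  end
end

id_batches(0, 2500) do |lower, upper|
  puts "id >= #{lower} and id < #{upper}"
end
```

Each pair would feed one query, so only one slice of rows is loaded at a time.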
This one caught me totally off-guard. My error definitely comes from ignorance about how the built-in Rails cache works. The built-in Rails cache, reachable with `Rails.cache`, is file-based by default. Now that I'm at this juncture, it was wrong of me to have assumed anything. The Rails Guides clearly name the default. So, if you're doing `Rails.cache.fetch` anywhere in your code-base, you've got to be aware that this causes caching between runs of rspec. This means that if, say, you were trying to simulate a good result on one run and an error case on another, you'd have trouble reproducing the error case. I've written up an example to illustrate. The change is available on github.
UPDATE: So, I've decided that there's something up since there's a flag set in the env test.rb file that should have disabled caching for tests. I'm in the process of following up with rspec-users and, if needed, a rails group.
UPDATE 2: The flag in test.rb is for the controller level caching. I do a Rails.cache.clear before every test now.
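The failure mode is easy to reproduce without Rails. The sketch below uses `TinyCache`, a hypothetical in-memory class I wrote to mimic fetch-with-a-block semantics; it is not the Rails cache store, but it shows why the second "test" never sees its simulated error until the cache is cleared.

```ruby
# Hypothetical in-memory stand-in for a cache store with fetch/clear.
class TinyCache
  def initialize
    @store = {}
  end

  # Return the cached value for key, computing and storing it on a miss.
  def fetch(key)
    return @store[key] if @store.key?(key)
    @store[key] = yield
  end

  def clear
    @store.clear
  end
end

cache = TinyCache.new

# "Test 1" simulates a good result.
p cache.fetch("lookup") { "good result" }              # => "good result"

# "Test 2" tries to simulate an error, but the block never runs.
p cache.fetch("lookup") { raise "simulated failure" }  # => "good result"

# Clearing between tests lets the error case actually happen.
cache.clear
begin
  cache.fetch("lookup") { raise "simulated failure" }
rescue RuntimeError => e
  p e.message  # => "simulated failure"
end
```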
How often do you write code like this?
if x
x.do_something_awesome
end
Or maybe you do stuff like:
if my_hash && my_hash["key"]
# do something awesome
end
The point is, I very routinely use conditional statements that rely on nil evaluating to false. This is all fantastic, and very convenient, but it’s important to note that nil != false.
ruby-1.9.2-p180 :055 > nil == false
=> false
Let’s do an example:
ruby-1.9.2-p180 :069 > nothing = nil
=> nil
ruby-1.9.2-p180 :070 > if nothing
ruby-1.9.2-p180 :071?> puts "turns out we have something out of nothing"
ruby-1.9.2-p180 :072?> else
ruby-1.9.2-p180 :073 > puts "sure enough, nothing is nothing"
ruby-1.9.2-p180 :074?> end
"sure enough, nothing is nothing"
=> nil
Hopefully this was the result you were expecting. Now, let’s add a very slight twist. Let’s try to use our nothing in other logical operations.
ruby-1.9.2-p180 :106 > new_nothing = true && nil
=> nil
We still have nil. I would have expected this operation to return false, but nil is still treated as false in a conditional, so maybe we’re good.
Let’s look at an example where using nil as if it were false can get us into trouble:
ruby-1.9.2-p180 :107 > {"my_param_1" => false, "my_param_2" => nothing}
=> {"my_param_1"=>false, "my_param_2"=>nil}
ruby-1.9.2-p180 :108 > {"my_param_1" => false, "my_param_2" => nothing}.to_json
=> "{\"my_param_1\":false,\"my_param_2\":null}"
This seems like a contrived example, but if your controllers support json or xml responses, it might be more of a danger than you think. The worst part is if you have metadata, and you want to do something like:
ruby-1.9.2-p180 :109 > {"my_param_3" => true && nothing}.to_json
=> "{\"my_param_3\":null}"
Now hopefully we see the danger. The way around this is to force an evaluation to false. The easiest way that I know of is to use the double-bang operator (which is really just the bang operator twice): !!
ruby-1.9.2-p180 :110 > {"my_param_3" => !!(true && nothing)}.to_json
=> "{\"my_param_3\":false}"
ruby-1.9.2-p180 :111 > {"my_param_3" => !!(true && true)}.to_json
=> "{\"my_param_3\":true}"
Now we can rest easily.
Also, this is a good lesson on why testing trivial things from your models and controllers is often not a waste of time.
A slightly more concrete example? Okay. How about sharks with lasers attached to their head? … well, couldn’t find any of those, so:
class SeaBass < ActiveRecord::Base
  attr_accessible :mutated, :ill_tempered, :laser_equipped
  def ready_to_impersonate_shark?
    self.mutated && self.ill_tempered && self.laser_equipped
  end
end
Now let’s say we were doing an inventory of our SeaBass from our web console, and want to have a simple 2 column view:
| SeaBass ID | Ready to Impersonate Shark |
|---|---|
| 1 | true |
| 2 | false |
| 3 | true |
| 4 | false |
You might be tempted to write something like:
<table>
  <tr><th>SeaBass ID</th><th>Ready to Impersonate Shark</th></tr>
  <% @seabass.each do |seabass| %>
    <tr><td><%= seabass.id %></td><td><%= seabass.ready_to_impersonate_shark? %></td></tr>
  <% end %>
</table>
This would leave you with
| SeaBass ID | Ready to Impersonate Shark |
|---|---|
| 1 | true |
| 2 | |
| 3 | true |
| 4 | |
So it doesn’t look perfect, so now you’re in the fire. What a great place where the !! operator could have been used.
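Here is roughly what the fix looks like, as a plain-Ruby sketch. `PlainSeaBass` is a hypothetical Struct standing in for the model so the example runs without Rails; the only real change from the method above is wrapping the expression in `!!`.

```ruby
# Plain-Ruby stand-in for the model; a nil attribute means "never set".
PlainSeaBass = Struct.new(:id, :mutated, :ill_tempered, :laser_equipped) do
  # !! collapses nil (and any other falsy value) to a real false.
  def ready_to_impersonate_shark?
    !!(mutated && ill_tempered && laser_equipped)
  end
end

inventory = [
  PlainSeaBass.new(1, true, true, true),
  PlainSeaBass.new(2, true, nil, true)
]

inventory.each do |bass|
  puts "#{bass.id} | #{bass.ready_to_impersonate_shark?}"
end
# 1 | true
# 2 | false
```

With that, the second column always prints true or false instead of going blank for nil.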
Firefox 5 Looks Great, but ... I use a tonne of plugins. Which ones work with Firefox 5? Is there a way to know how horribly broken I will be if I upgrade, before I upgrade? Maybe I'll just wait for 6 to come out before upgrading to 5.
It's been a little over two weeks since I upgraded my MacBook Pro to 8 GBs of RAM and a SSD. I have been extremely pleased.
The Details
I have a 15" 2.66 Core 2 Duo MacBook Pro from 2009. Its detailed specs are here.
I first went to a Fry's store nearby to buy the memory and hard drive. I went under-prepared thinking the technicians there would be knowledgeable and helpful. They were not. Returned everything, and tried again.
My second attempt went a lot better, and I recommend following in my footsteps. I went through Other World Computing, and I bought:
- 1 of 115GB Mercury EXTREME Pro SSD 2.5" Serial-ATA 9.5mm Solid State Drive
- 1 of DIY KIT: 115GB Mercury EXTREME Pro SSD +OWC USB 2.0 Express 2.5" Enclosure Kit
- 1 of 8.0GB (4.0GB + 4.0GB Kit) PC-8500 DDR3 kit
Step 1: Install Carbon Copy Cloner. What a GREAT product! Seriously!
Step 2: If needed, reduce the contents of your primary hard drive to under 100 GB. I did this by moving movies onto my external drive.
Step 3: Hook up one of your new SSDs into the external USB enclosure that came in the kit, and format it. (Use the built in Disk Utilities program: Applications > Utilities > Disk Utilities. Here's a how-to.)
Step 4: Clone your hard drive to the external drive with Carbon Copy.
Step 4.5: At this point, you can try out booting from your newly cloned HD to make sure it works before you replace your existing drive. (Hold the option key as you restart to get the choice of which to boot from. Took me a few tries to get it. Apple's article.)
Step 5: Shut down your MacBook.
Step 6: Take off the bottom. (The screws were really really tight for me. Took a while to get them out. Keep a note of where they go, 3 are significantly bigger.)
Step 7: Swap out the RAM. Here's a video of someone doing that.
Step 8: Swap out your HD for the new SSD that you just cloned to.
Step 9: Re-assemble and power on.
Potential problem:
- it doesn't turn on, it just beeps. This happened to me on my first attempt. The cause was that the memory the guy at Fry's sold me wasn't compatible with my MacBook.
Now, why did I buy the second SSD? Because I heard that SSDs fail like mad. So I back up my HD every night using a scheduled Carbon Copy job. If anything ever happens, I'll just be able to swap out my SSD.
The Results
First off, I am a power user. On a regular day, I'm writing software, using numerous terminal windows, I often have multiple servers running, and maybe there's also a database in there. I also have both Chrome and Firefox open, each with a bagillion tabs open. On top of that, I'm usually listening to music, reviewing something in Preview, skyping, gchatting, etc etc. Secondly, this is totally anecdotal and non-scientific.
Under my old setup, startup was slow. Every couple of days I'd need to reboot, and Firefox would routinely spiral out of control and need to be restarted. That is to say, my computer would just get really sluggish. Firefox would show the worst of the symptoms, but it wasn't just Firefox. Restarting Firefox buys some time, restarting the computer is 'the cure'.
In my current setup, startup is almost instant. That is to say, from logging in to ready to go with browsers open is super fast. I also went almost 2 weeks without needing to reboot. Today was the first time I needed to shut down due to sluggishness. I was hoping to notice increased speed in my tests, but I do not. However, I have noticed that I am able to run 'significantly' more servers. In my old setup, I remember looking through terminals to find a Rails process that was still running because I suspected it of causing other activities to slow down. I have not done that yet -- usually I'm running out of open ports on which to start listening.
Anyway, bottom line is: was it money well spent? and Would I do it again? Yes and Yes. I really like this new setup, and I wish I had made the upgrade sooner. This journey was inspired by Coding Horror.
The answer is: a free mug. Yesterday, The Daily WTF had mugs, and then ran out. I was bummed. I need a new coffee mug, and I've been looking around online. I was all set to go with a vi reference mug, but I have a $5 gift card with ThinkGeek, and they've been out of stock for a while now.
Today I see that Microsoft is offering a free Daily WTF mug to sign up for a no-obligation trial of Azure. And you know what? I plan on at least digging through the docs and spinning up an instance or two to see how it works. Incredible the power that a promotion specifically tied to a site I visit regularly can have. For ~$15 per relevant person, it's probably money well spent. If they could tie the promotion to actually spinning up an instance, then they could probably offer bigger incentives. I think they should work on that so that I can get a Hacker News Hoodie.
The other day I tweeted that Paypal is now threatening to cut me off if I don't agree to accept any statements they want to send me by email. No, I don't think that Paypal has ever mailed me anything, but there it is nonetheless.
Today, I was on Netflix to reactivate my account, and chose to pay by Paypal. I've done this before, and I noted then that Paypal tries to get you to use direct debit from banks over other forms of payment. There'$ an obviou$ reason for this: direct debit is significantly cheaper for Paypal than processing credit cards is. An obvious way for Paypal to boost margins is to drive more traffic to direct debit and away from credit cards.
Anyway, as of today, I cannot choose my payment method. I can simply agree to pay by Paypal, and they will choose my payment method for me each month as my subscription fee is due. Here's what that looks like. Note that the "Payment Method" is simply "Available Funding Sources" with no options there. (I clicked the other links on the page, none would allow me to change the method.)
So I checked out their policies. The relevant section of the policies is listed below. Suffice it to say that Paypal will default to using the payment methods most favorable to them. Specifically, if you have a balance, then you must use that balance, then onto Paypal-affiliated payment methods, and so on. I do not want to have my Netflix subscription randomly withdrawing from money someone else has sent me through Paypal, withdrawing from my bank, or charging my credit card, depending on my respective balances. (Also, were I running close to zero in my bank account, I'm sure my bank would be happy to approve an $8 auto-debit reduction and then charge me a $30 overdraft fee. Any reporter or litigious lawyer want to dig for a conspiracy here?)
The result of this might be nothing. I'm picky and whiny by nature, and I like to do things my way.
However, what I see is the dominant player in an industry restricting the flexibility of its products, reducing the satisfaction they give to their customers, and lowering the value of their products. The last point is because I didn't sign up for Netflix as a result -- I probably will later, but my wallet is somewhere in the other room and I'm THAT lazy. Netflix will undoubtedly see me as a statistic of abandonment in the Paypal checkout pipeline, but maybe others are uncomfortable giving Paypal that kind of leeway, too. Either way, every day longer it takes me to go sign up for Netflix is lost revenue which can be attributed to a Paypal product.
Paypal looks very vulnerable to me. They had a very innovative service 10 years ago, and it hasn't materially changed much. They had a big lead, and are milking it quarter by quarter, but they seem to be really lacking in innovation. There are a tonne of startups getting traction in the payment space *cough*Square*cough* while Paypal's bread and butter eBay play is eroding (eBay's "growing" at 13% vs. Amazon's 84%). So what's Paypal's strategy to defend and extend their lead? Make their customer experience worse in order to improve margins!
Will Paypal roll over and die tomorrow? Certainly not; but it's no growth play. It's going to do exactly what Microsoft did 10 years ago, and just go sideways (exactly as it has been doing for 5 years now). It will come out with some interesting tidbits here and there, but just keep relying on its old trick. Square (or someone else) is going to be the Google or Apple to Paypal's Microsoft. So ... going to go short on eBay. I figure I can't lose -- the stock is certainly not going up.
Default Payment Methods
PayPal will fund your transaction from your payment sources on file with PayPal in this order, unless you make a change as described below:
1) PayPal Balance
2) Instant transfer from your bank account (if eligible)
3) PayPal Credit (Bill Me Later, PayPal Extras Card, or PayPal Smart Connect)
4) Debit card
5) Credit card
6) eCheck (a delayed transfer from your bank account - may result in significantly slower shipping by seller)
Changing the Payment Method
You may change the Payment Method at the time you make a payment by clicking the 'More Options'/'Change' link on the Confirm Your Payment/Review Your Information page and then selecting a payment method on the "More Funding Options" page. You may do this each time you make a payment if you do not have a Balance. If you have a Balance, you must use your entire Balance before you can change the payment method. You cannot select a payment method for all future transactions, except that if you have been approved for PayPal Credit you may select PayPal Credit as your preferred payment method. You may do so by logging in to your Account, selecting “Profile”, selecting your PayPal Credit product, and then setting it as your preferred funding source.
Payment Methods may be limited for a transaction, including if you make a PayPal payment through certain third party websites or applications. For Business Payments, you are limited to funding your PayPal payment with either (or both) your Balance or by eCheck.
Someone was asking about how to write a compiler. An excerpt of an answer was "Go to college, specialize in software engineering." LOL http://programmers.stackexchange.com/questions/84278/how-do-i-create-my-own-p...
... of open source projects that is https://github.com/MrMEEE/bumblebee/commit/a047be85247755cdbe0acce6#diff-1
Posts
elastic-beanstalk-describe-application-versions -a APPLICATION_NAME | sed '1,2d' | sed '/CURR_APP_VERSION/d' | cut -d '|' -f 6 | xargs -L 1 elastic-beanstalk-delete-application-version -a APPLICATION_NAME -l
plutil -lint filename.plist
I deploy to AWS Elastic Beanstalk via API using scripted/templated config files.
I had a problem with a UniformInterfaceException today. Took me a while to track down. The problem is that I have nested Java objects in my response object getting serialized, and I had deleted the default constructor of one of them. Not exactly sure what Jersey needs the default constructor for when serializing (it makes total sense for deserialization), but that's what caused the error. It was particularly nefarious to track down because everything in my code worked great, but my client was getting 500s. This was because the error happens on serialization, which happens after my code returns.
Forgot to make your EBS drive permanent?
I can't believe this page exists and I'm just finding out about it.
Quite possibly the best tool of all time when working with AWS.
For search and replace, to get it to be interactive (prompt at each occurrence whether to make the substitution or not), add the c flag:
%s/old/new/gc
I feel like I matured some time in the last few years and never realized it. I've been working full time with Ruby for more than a year now, and I just learned how to create global variables -- prefix them with $. So $var is accessible from everywhere. I have not used them yet, and I don't intend to start, still, TIL.
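For the record, here is a minimal sketch of what that looks like. The names ($request_count, Tracker, current_count) are made up for illustration; the only point is that a $-prefixed variable is visible inside a class and inside a top-level method alike.

```ruby
# Hypothetical names, for illustration only.
$request_count = 0 # the $ prefix makes this a global

class Tracker
  def hit
    # No accessor or argument needed: the global is visible here.
    $request_count += 1
  end
end

def current_count
  # ...and here, in an unrelated top-level method.
  $request_count
end

Tracker.new.hit
Tracker.new.hit
puts current_count
```

Which is exactly why I don't intend to use them: anything, anywhere, can change the value.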
When I’m coding up a feature, and I accept options to be passed in a hash, I often take a shortcut. Rather than check that the key is there and that it has a value, I just test the value for truthiness, like so:
# The way I basically always check
def my_method(options = {})
  if options[:key]
    # do something using the option here
  end
end
I would go out on a limb and say that this (or something very close) is the proper way to check this:
# Probably the proper way to check a hash for options
def my_method(options = {})
  if options.has_key?(:key) && !options[:key].nil?
    # do something using the option here
  end
end
Today, the shortcut (or my poor option-naming skills) cost me a bit of time. I had an option which was either true or false. The problem is that the first form of my option checking won’t execute if the option is set to false, even though I want to take action whenever the option in question is set (to true or false). Here’s what I really wanted my logic to be:
# What I really wanted my logic to be
def my_method(options = {})
  if options.has_key?(:key) && !options[:key].nil?
    if options[:key]
      # do something using the option here
    else
      # do something else
    end
  end
end
I’ve decided my true error was in the way I named my option. The shortcuts work if you follow convention, and the amount of effort to force the rest of the program to conform to my first instinct in naming is big.
Something to watch out for.
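To make the difference concrete, here is a small runnable sketch. The option name :notify and both method names are hypothetical, chosen only for illustration. It separates "not set", "set to false" and "set to true", and shows the truthiness shortcut lumping the first two together.

```ruby
# Hypothetical option and method names, for illustration only.

# Three-way check: a missing key and an explicit nil both count as "not set".
def option_state(options = {})
  value = options[:notify]
  return :not_set if value.nil?

  value ? :enabled : :disabled
end

# The shortcut: false is indistinguishable from a missing key.
def shortcut_sees_option?(options = {})
  options[:notify] ? true : false
end

puts option_state.inspect                          # no options passed
puts option_state(notify: false).inspect           # explicitly disabled
puts shortcut_sees_option?(notify: false).inspect  # shortcut misses it
```

A boolean option is safe with the shortcut only if "absent" and "false" really do mean the same thing, which is the naming convention I broke.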
It's the caret symbol ^
http://gilesbowkett.blogspot.com/2007/05/means-xor-in-ruby.html
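A quick sketch of the caret in action, for anyone who wants to try it in irb. The helper exactly_one? is a made-up name, not part of Ruby; it just normalizes its arguments to booleans before applying ^.

```ruby
# On booleans, ^ is logical exclusive-or.
puts((true ^ false).inspect)
puts((true ^ true).inspect)

# On integers, ^ is bitwise exclusive-or: 0b101 ^ 0b011 = 0b110.
puts(5 ^ 3)

# Exponentiation is **, not ^.
puts(2**3)

# Hypothetical helper: true when exactly one argument is truthy.
def exactly_one?(a, b)
  !!a ^ !!b
end

puts exactly_one?("value", nil).inspect
```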
http://www.sqlite.org/lang_expr.html
http://www.postgresql.org/docs/8.2/static/functions-comparison.html
https://github.com/rspec/rspec-core/blob/master/lib/rspec/core/hooks.rb#L146
Ever have multiple callbacks? Ever have them depend on each other? I have, sometimes without realizing it. You can put them in order, but it’s not obvious to someone coming along that there’s any dependency, and it could be a nasty bug to track down. Here’s an example of an order-dependent callback.
class MyModel < ActiveRecord::Base
  before_create :action_1
  before_create :action_2 # might depend on action_1

  def action_1
    self.mydata = "default"
  end

  def action_2
    self.mycomplexdata = mydata + "more data"
  end
end
I used to put the two methods inside another method, but that required another method, often with a much less descriptive name than the individual actions to be taken, and required you to navigate the file looking for callback definitions. A much better way, IMHO, is to use lambda.
class MyModel < ActiveRecord::Base
  before_create lambda {
    action_1
    action_2
  }

  def action_1
    self.mydata = "default"
  end

  def action_2
    self.mycomplexdata = mydata + "more data"
  end
end
Clean callbacks at the top of the file, and order is less likely to be broken by ‘standard’ refactoring.
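If you want to play with the idea outside of Rails, here is a rough plain-Ruby sketch. TinyModel and its before_create are a stand-in I made up for illustration; they are not ActiveRecord and only mimic the one behaviour that matters here: the lambda runs with the record as self, so the two actions execute in the order written.

```ruby
# Made-up stand-in for ActiveRecord::Base, for illustration only.
class TinyModel
  # Register a callable to run before create.
  def self.before_create(callable)
    create_callbacks << callable
  end

  def self.create_callbacks
    @create_callbacks ||= []
  end

  def create
    # instance_exec runs each lambda with this record as self.
    self.class.create_callbacks.each { |callback| instance_exec(&callback) }
    self
  end
end

class Greeting < TinyModel
  attr_accessor :mydata, :mycomplexdata

  # One callback, so the dependency between the two steps is explicit.
  before_create lambda {
    action_1
    action_2 # reads mydata, so it must run after action_1
  }

  def action_1
    self.mydata = "default"
  end

  def action_2
    self.mycomplexdata = mydata + " more data"
  end
end

puts Greeting.new.create.mycomplexdata
```

Swapping the two lines inside the lambda fails immediately with a NoMethodError on nil, which is a lot easier to spot than two separate before_create declarations drifting apart during a refactor.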
I’m developing a feature tonight in which we have a new email that we’re going to start sending. I’m developing on my own branch.
1) I like just linking to images rather than shipping them as payload
2) I want to test what my email looks like fully rendered
3) The images aren’t in my master branch of my code (and thus, not deployed)
Thanks to Jason Rudolph for this one http://jasonrudolph.com/blog/2009/02/25/git-tip-how-to-merge-specific-files-f...
Now I just git checkout my images into master ahead of my features. They can go to production ahead of my changes going to master. When I’m ready, I can merge and deploy. I haven’t tested this, but since I’m doing this through git, I would fully expect that, should I choose to change or delete these images, when I eventually merge my branch to master everything will get cleaned up.
Joy
array.should =~ another_array
Update: mmm, I've just run another test with this and have == returning true, but =~ returning false.
Latest checkin
- @Daphne's Greek Cafe (9516 Culver Blvd) 8 months ago in Culver City, CA
Checkin history
- @Daphne's Greek Cafe (9516 Culver Blvd) 8 months ago
- @ThinkNear HQ (3710 S Robertson Blvd) 8 months ago
- @Chipotle Mexican Grill (9512 Culver Blvd.) 8 months ago
- @ThinkNear HQ (3710 S Robertson Blvd) 8 months ago
- @Daphne's Greek Cafe (9516 Culver Blvd) 9 months ago
- @ThinkNear HQ (3710 S Robertson Blvd) 9 months ago
- @Tender Greens (9523 Culver Blvd) 9 months ago
- @ThinkNear HQ (3710 S Robertson Blvd) 9 months ago
- @ThinkNear HQ (3710 S Robertson Blvd) 9 months ago
- @Trader Joe's (9290 Culver Blvd.) 9 months ago
- @Tender Greens (9523 Culver Blvd) 9 months ago
- @Father's Office (3229 Helms Ave) 9 months ago
- @Daphne's Greek Cafe (9516 Culver Blvd) 9 months ago
- @ThinkNear HQ (3710 S Robertson Blvd) 9 months ago
- @ThinkNear HQ (3710 S Robertson Blvd) 9 months ago
- @Tender Greens (9523 Culver Blvd) 9 months ago
- @ThinkNear HQ (3710 S Robertson Blvd) 9 months ago
- @ALDO (1450 3rd St. Promenade) 9 months ago
- @Barney's Beanery (1351 3rd St Promenade) 9 months ago
- @Old Navy (1232 3rd St Promenade) 9 months ago
Posts
ms the body through its role in diabetes, obesity and fatty liver, this study is the first to uncover how the sweetener influences the brain. Sources of fructose in the Western diet include cane sugar (sucrose) and high-fructose corn syrup, an inexpensive liquid sweetener. The syrup is widely added to processed foods, including soft drinks, condiments, applesauce and baby food. The average American consumes roughly 47 pounds of cane sugar and 35 pounds of high-fructose corn syrup per year, according to the U.S. Department of Agriculture.
ontinues to get stronger every year? It doesn’t have to be that way. I was wearing progressively stronger lenses for my nearsightedness until ten years ago I accidentally stumbled upon a method that allowed me to achieve 20/20 vision and throw away my glasses within a year. For the past decade I have not worn glasses or contacts, but I am able to drive, read, and see everything clearly and sharply. The secret was learning how to actually change my eyes so that they could focus clearly on any objects — near or far, without wearing glasses. The method I used is one of the best examples of the self-strengthening technique called Hormetism, the focus of my blog, which I’ve ap