Showing posts with label Interview. Show all posts

Thursday, February 16, 2012

The Art of R: interview and mini-review


The Art of R Programming is an approachable guide to the R programming language. While tutorial in nature, it should also serve as a reference.
Author Norman Matloff comes from an academic background, and this shows through in the text. His writing is formal, well organized, and tends toward a pedagogical style. This is not a breezy, conversational book.
Matloff approaches R from a programmer's perspective, rather than a statistician's. This approach shows through in several of the chapters: Ch 9, Object-Oriented Programming; Ch 13, Debugging; Ch 14, Performance Enhancement; Ch 15, Interfacing R to Other Languages; and Ch 16, Parallel R. I do wish he had spoken to using R with Ruby as well as C/C++ and Python. I also would have liked to see a chapter on Functional Programming with R, especially after the teaser in the Introduction.
I asked Norm and an R-using friend if they could help me get my head around things a little better, and the following mini-interview is the result.

Almost every language has some kind of math support. Why bother with R? Where does it fit in a programmer's toolkit?
Norm: It's crucial to have matrix support, not necessarily in terms of linear algebra operations but at least having good matrix subsetting capability. MATLAB and the Python extension NumPy have this, but I'm not sure how far they go with it. And since MATLAB is not a free product (in fact very expensive) I'm summarily excluding it anyway. :-)
Second, R has a very rich graphics capability, which really sets it apart from the others. You can see some nice examples (with the underlying R code) in The R Graph Gallery.
Third, R is "statistically correct." It was created by top professional statisticians in industry and academia.
Russel: As something of a polyglot, I find that each language comes with something of an attitude of how problems should be approached. The grammatical structure and keyword vocabulary of each language drive a way of thinking about problems, as well as what sorts of libraries must be created to cover what may be base structures and functions in other languages. R has a particularly rich data representation vocabulary which lends itself very nicely to a data-centric problem solving mindset. While many more general-purpose languages can, with appropriate libraries, deal well with data, R reduces the cognitive load required for working with multidimensional data sets. In my (relatively limited) work with R, I've come to think of R as a domain-specific language that happens to have some general-purpose functionality, while other languages such as Ruby, Python, Perl, etc., are general-purpose languages with many domain-specific libraries.
I really feel drawn to the idea that languages drive approaches to problem solving. It reminds me of the #PragProg idea of a language of the year. With that in mind, what do you think a dynamic language (Perl, Python, Ruby, etc.) programmer is going to find new and different in R? What about a programmer coming from a system programming language (C, C++, etc.)?
Russel: There is much in R which is from the "dynamic language" camp you mentioned: dynamically typed variables, an interactive shell, dynamically loaded libraries, etc. These will be pretty quickly noticeable to a C/C++/Java/C# programmer.
The structure and forced-forethought enforced by those languages are part of their value proposition: they force programmers into design paradigms and ways of thinking that scale up well, while dynamic languages, with their looser syntax rules, do not enforce that sort of engineering discipline on the programmer. For highly organized people who think in very structured ways, dynamic languages are "freeing", while less structured thinking programmers can find that the lack of enforced structure puts a lot of onus upon them to be disciplined in their coding as program sizes get larger. For example, a simple flat namespace is great for a small program with a few dozen lines, but namespacing becomes much more important as your programs come to the thousands of lines and dozens of individual functions or components -- especially as programs become the shared workspace of multiple programmers.
I personally use R as a dynamic language, most of the time not even writing programs in it so much as using it in interpreted mode for data analysis and "analysis prototyping." In that sense, R does for data analysis what dynamic languages do for task automation: it allows you to easily play with scenarios and prototype your thinking about data quickly and easily. You can then codify the best of those techniques into a small (or large) program that can automate that work for various data sets.
Similarly, R has a very powerful and interactive help system. Most packages not only have a quickly available set of API and help documents, but sample data sets built right into the library. From a command line, R users can get examples of how to use almost any library, with sample data included specifically for that particular library.
R has some inconsistencies from its history that can make it feel more "old school" in some ways. For example, there are two object models and the older (S3-style) object model is widely used in older libraries. However, it's nowhere near as "bolted-onto" as languages like Perl or C. R has an extremely rich set of libraries easily available via CRAN (a la CPAN), but the flip side of this wealth is that these libraries work in many ways, expecting data in various formats, etc. Again, it's not as spotty as CPAN or the Python Cheese Shop, or even Pear (most packages are quite good), but it can leave some beginners feeling a little lost when they want to accomplish a certain task. That's pretty common in the open source world, of course, but can be an issue.
R's rich first-class data types build a foundation that is nicely added to by the various libraries and simple interactive shell. Enough libraries are written in native code that performance is generally top notch. For my part, I almost always find that the available libraries far exceed my generally limited statistical needs, so I rarely find myself needing to rewrite some particular statistical code. I'm not a statistician, so I find it quite valuable to not have to worry about that aspect of the work I'm doing in any given project. Additionally, the rich libraries generally spur me on to doing a richer analysis of the data than I would if I did not have such a fully-featured tool available.
Norm, in the Introduction of your book, you talk about R as a functional language. I wish there had been a chapter on this. Can you give some examples of what you mean? Russel, do you have any thoughts about R as an FP language?
Russel: Many languages have recognized the value of functional constructs and added at least simple implementations of lambda and map functions, first-class functions and the like. FP is generally considered to be more easily parallelized, and should thus scale better on modern multi-core and CUDA-like systems. This will be quite advantageous in large data processing jobs.
Norm: Every operation in R is a function. For instance, the operation y = x[5] is really the function call y = "["(x,5). The same goes for + and so on.
This is brought up throughout the book, starting with the vector chapter.
The biggest implication of this, in my opinion, is in performance. One can often speed up a computation by a factor in the hundreds by exploiting the FP nature of R.
What are some of the things you've done with R that show off its power and/or niche?
Russel: R works beautifully for many types of data analysis problems. I recently used R to generate annotated graphs of Bayesian content filter scorings against timestamps, with lowess smooth and regression line and other enhancements, all built into the graphs without additional effort. This was done for all permutations of the 5 variables used in the study which had tens of thousands of data points. I was using this as a script because of my need to regenerate the graphs repeatedly, but before I'd codified that process, I used R in a "tweak and go" sort of way, as R lends itself well to ad hoc data exploration. Adding and removing data attributes, filtering data, generating data models, regressions, etc., are all easy to do in an on-the-fly manner.
Norm: A fun application I've done is R code to analyze the differences and similarities between the various dialects of Chinese. It can be used as a learning aid for those who know one Chinese dialect but not another. This is an example in my book, in the chapter on data frames.

If you're interested in adding R to your arsenal of programming tools, this is a great way to get started.
Truth in posting—No Starch Press sent me a free copy of this book to review.
Posted by gnupate 0 comments

Thursday, March 31, 2011

Protocol Buffers - BJ Neilsen's Take

BJ Neilsen (@localshred or at github) is a member of my local Ruby Brigade, and he's hacking with/on Protocol Buffers with Ruby — oh, and he's a fan of Real Salt Lake too.

He works for MoneyDesktop, a startup based in Provo, Utah, where he helped them transition away from a less-than-desirable PHP solution to Rails. They now enjoy an entirely new service architecture driven by Ruby (and Protobuf). When not working with Ruby, he runs OneSimpleGoal and plays around with iOS and Objective-C.

To get another take on Protocol Buffers, I asked BJ to join me for a quick interview. Enjoy!


How did you get started using Protocol Buffers?

BJ: At the beginning of 2010 I was hired by a startup in Provo to help build out their product offering. The entire application was written in Java, but for the piece I was to be in charge of I was given free rein to choose a platform. Of course I chose Ruby, but it soon became apparent that we needed a solid way to get data from one application to the other.

This need launched a refactor to a more service-oriented approach. Different solutions were researched for dealing with data interchange such as Thrift and the like, but we ended up choosing Protobuf for its simplicity, pedigree, and multi-platform support. No XML, no WSDL, just simple definitions compiled to the language of your choice. Defining a Data Structure and API with one declarative language, and then being able to build the client and server implementations in two different languages was a huge win. We created a Socket-based RPC server on the Java side, and called the endpoints from Ruby. It was very simple.

I'm now with a new company and the new team was very receptive to the idea of a Protobuf Service ecosystem for our service-oriented application. It is currently the primary method of internal data interchange between multiple service applications. At the time of writing, we have over 20 different proto definition files, 63 separate defined data types (including Enums), and 15 independent service classes implementing a total of 32 service endpoints.

What do you see as the strengths of the Protocol Buffers data format?

BJ: One of the greatest strengths of Protobuf is its clear data definitions. Open up any .proto file and it's not hard to deduce the structure of the represented Data Types. Defining Service endpoints is similarly simple, meaning all of the ambiguity of Wiki-based (or similar) API documentation is immediately eliminated. Clarity is such a key when building a large system with a team of any size. Being able to clearly understand how and what data is transferred within the system is absolutely key, especially when you hire beyond your core development team and need to get people contributing quickly.

I've already mentioned the power we gained from being able to tie together a Service architecture with multiple languages in a unified API. The Protobuf project officially supports Java, C++, and Python implementations for the definitions compiler and data serialization code, but they have a ton of third party code listed for many other languages like Objective-C and JavaScript (with support in Node.js as well).

Which Protocol Buffers implementation are you using? How did you end up choosing it?

BJ: The only Ruby project listed on Protobuf's "Third Party" page (at the time) was Mack's Ruby-Protobuf. This was a great start as the compiler was built in YACC. However, once I started integrating the API into our Ruby application, it became clear that the RPC side had been half-baked and just sort of thrown out into the wild. Files were compiled and stubbed in the wrong places, meaning that if I added any code to the stubbed client or server files, subsequent compiles would overwrite my changes. Not good.

By that time we were full-steam ahead on the Protobuf implementation in the other services, so I basically had to go in and rewrite the compiler code generation for each of the services, as well as completely rewrite the entire RPC backend to become compatible with the Protobuf SocketRPC library written for Java. Since that first rewrite in early 2010, I've done another rewrite (late 2010) to use EventMachine as the RPC backend, and I can tell you it's light-years faster, and the DSL is much sexier also, looking much more like an AJAX request with callbacks than a standard socket connection with byte-reading hell. You can get that code on my github fork on the compatability-0.4.0 branch.

What are your plans for your fork of Mack's ruby-protobuf? Will it get wrapped into his distribution or will you go all the way, rename it, and start publishing it as a gem?

BJ: Fantastic question. Currently I've packaged the gem internally for our SOA ecosystem to get around the problem of getting it into a full release with the original code. I've embarked on merge-hell attempting to get my code to work with theirs several times now and each time it just feels like it's not worth it. I've yet to have contact with the original developers (I'm fairly sure they live in Japan) and so I'm not entirely sure they'd accept any patches I'd send anyways.

I've also toyed with the idea that since I've changed a significant chunk of the original code I could just make it my own gem with some witty name (and a reference to the original). The only thing that has kept me from that path is that a) I'd prefer not to insult the original developers, and b) I'm a bit ashamed that there aren't very many tests backing up the RPC backend (the major piece that I wrote from scratch).

Each day we have thousands of successful RPC calls with a virtually non-existent error rate running through the EventMachine RPC code written into this gem, so it has certainly been battle tested in a heavily used production system. Unfortunately it just doesn't have that warm fuzzy feeling (for those who haven't used it yet) that you get when you have 200 green tests behind each class. However, patches with tests are certainly welcome :).

Anyone can pull from my fork on the compatibility-0.4.0 branch (essentially my "master" I build the gem from) and build their own gem if they wish. The current version in my fork is 0.4.0.8. I'd be happy to provide any answers to questions that may arise, and I may even be available to consult with anyone on how to implement Protobuf into your current system.

You gave a presentation on Protocol Buffers at uv.rb. How was it received? Do you see more people starting to use this data format?

BJ: To be honest, I'm not sure my presentation went the way I'd hoped, certainly not well enough to highlight many of the benefits and reasons for using Protobuf. I spent too much time showing the "How" instead of the "Why". I think many people left the meeting intrigued but it was also marred by a drawn-out rant by a few of the developers that were present, debating whether or not it was more prudent to use REST/JSON than a more declarative format like Protobuf.

The argument is moot simply because both styles are great, they just fulfill slightly different needs. When it comes to "Code as Documentation" it's hard to argue against Protobuf, a format that is much easier for devs from other languages to buy into. I've never had a developer come to work on a Protobuf API who, after being shown the .proto files, could not understand how to read or extend the definitions.

I hope that developers will give the format a try because I think it's the next level up from normal web application design. It's the start of understanding that for larger applications, different tools should be considered to help alleviate the pains of a (potentially) larger system and the needs of moving data from one place to another on the fly.

Ok, that's a pretty intriguing statement. What different tools should we be looking at (or developing) to work on larger systems and larger data sets?

BJ: Hopefully I don't get myself into too much hot water with the answer to this question (or go off on a large tangent), but here we go. Keep in mind also that this long-winded answer comes with a grain of salt, because every system will be designed to meet different goals. Therefore, there is no "one true way" as some would tout.

That being said, if you are looking to build a system for growth, there are certain concepts and technologies that should at least be considered from the outset. Service-oriented Architecture (SOA) is a way of designing a system for growth; to me it's the most natural way to begin with the journey in mind. For those new to SOA, a short primer: It involves creating smaller independent applications that are easier to write and maintain because they focus on smaller feature sets, while when roped together you can gain the benefit of all the systems working as a whole and ready to scale.

In this type of system we never want to share data between service applications directly, such as connecting from Service A to Service B's database to get user data. We share data by creating APIs for each service application (with protobuf of course :)), then publish those APIs for our other services to consume. If one application needs user data, it doesn't connect to the user database, it connects to the internal User service's API to gather the data. Naturally protobuf fits extremely well here, but REST/JSON or SOAP or (insert other transport protocol here) can obviously be used also.

Other "large systems" or so-called "enterprise" technologies that fit well into an SOA system are background jobs (queues) and various types of messaging systems.

Queueing is essential for the speed and scalability of a system as it offloads non-relevant (yet important) processing to separate threads or processes. A simple example of how a queue can give you an increase in speed and usability of a system is sending an email when a user is created. The user generally doesn't care (or know) that you are sending them an email when their account is created, but they do care if it's taking 10 seconds. So rather than tie up the user's process just to send an email, you would queue that "job" for later (even if it's processed milliseconds later) and let the process return the result of the user creation. Workers in other threads or processes will pick up the email job and send the email for you.
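
To make the email example concrete, here is a minimal sketch using only Ruby's built-in Queue and Thread. It is not how Resque works internally; the names create_user, JOBS, and SENT_EMAILS are hypothetical, and a real system would use a persistent queue and actually deliver the mail.

```ruby
# Hypothetical sketch: an in-process job queue built on Ruby's stdlib.
# A production setup would use a persistent queue instead of memory.
JOBS = Queue.new
SENT_EMAILS = Queue.new # thread-safe record of "sent" emails, for illustration

# Worker thread: pulls jobs off the queue until it sees the shutdown marker.
worker = Thread.new do
  while (job = JOBS.pop) != :shutdown
    # Stand-in for the slow part (talking to a mail server).
    SENT_EMAILS << "welcome email to #{job[:email]}"
  end
end

# Hypothetical user-creation method: it enqueues the email job and
# returns immediately instead of waiting for the email to go out.
def create_user(email)
  user = { email: email }
  JOBS << { type: :welcome_email, email: email }
  user
end

user = create_user("alice@example.com")

# Ask the worker to stop once the queue is drained, then wait for it.
JOBS << :shutdown
worker.join
```

The caller gets its user back without waiting on the email; the worker picks the job up whenever it gets to it.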

The main queueing system we use is Github's excellent Resque coupled with my own little resque-remote plugin. Resque-remote gives us the ability to queue a job for another service to consume.

Messaging is such an enormous topic that I'm not sure I'm the one you want to describe its ins and outs. The short of it is that in certain contexts we've found that it can make more sense to use push-based data transfer rather than pull-based. Take the user creation example: when a user is created in my User Service Application, the user service doesn't know about any other systems that may be interested that a user was created, and frankly it shouldn't care. The User Service should only be responsible to post a message (to a message service or bus) that an event occurred in the system, in this case a user was created. Once the event is messaged, user service creation can go about its merry way. Other parts of the system may be listening to the message (event) bus for user creation events and their associated data, and they will receive the data as a push. This specific messaging paradigm is usually referred to as PubSub (Publish/Subscribe). As I've already mentioned, there are many many more types of messaging patterns that can be followed.
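
The publish/subscribe idea can be sketched in a few lines of plain Ruby. This is an in-process toy, assuming a hypothetical EventBus class and a :user_created event name; a real message bus would sit between separate services and deliver events over the network.

```ruby
# Hypothetical in-process event bus illustrating publish/subscribe.
class EventBus
  def initialize
    # Each event name maps to a list of handler blocks.
    @subscribers = Hash.new { |hash, event| hash[event] = [] }
  end

  # Register interest in an event; the publisher never sees this.
  def subscribe(event, &handler)
    @subscribers[event] << handler
  end

  # Push the payload to every handler registered for the event.
  def publish(event, payload)
    @subscribers[event].each { |handler| handler.call(payload) }
  end
end

bus = EventBus.new
welcome_emails = []
audit_log = []

# Two independent listeners, standing in for two other services.
bus.subscribe(:user_created) { |user| welcome_emails << user[:email] }
bus.subscribe(:user_created) { |user| audit_log << "created #{user[:id]}" }

# The user service only announces the event and moves on.
bus.publish(:user_created, { id: 1, email: "alice@example.com" })
```

The publishing side has no knowledge of who is listening, which is the decoupling described above.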

These are just a few of the systems we've put in place to manage data transfer complexity in our SOA ecosystem. There's also another branch for data warehousing such as ETL data transfer systems like Pentaho or Jasper. The possibilities are... well, you get the idea.

The coolest part about all of this is that you can use Ruby for 100% of these so-called enterprise situations. We do. You don't have to use Java or .NET to solve "Big Boy" problems. When I first started with Ruby, I wasn't entirely sure of this, but I certainly am now.


So, you've read along this far. What do you think? How are you using Protocol Buffers? Why did you choose to go down this route?

Posted by gnupate 3 comments

Wednesday, August 25, 2010

Ruby|Web Interview with Pat Maddox


Ok, if I'm going to post about GoGaRuCo today, I should also spend some time on Ruby|Web, the latest regional conference from Mike Moore (@blowmage) and friends — truth in advertising, I'm a volunteer on the board for Ruby|Web, so I might be a bit biased.
Just so my biases don't show too much, I asked Pat Maddox (@patmaddox) to answer a few questions for me. Of course, he's a speaker at Ruby|Web, so the spin is probably still there.
Regardless of our mind-set, this looks like it's going to be an awesome conference. The only problem is that you need to register before Sep 3rd. You don't have much time ... maybe you should go register first, then come back and read what Pat has to say.

Ruby|Web is a new name in the regional Ruby conference space. What drew you to it?
Pat I love coding in Ruby, and I think the web is a great platform to develop for. I'm eager to spend a few days rubbing elbows with like-minded people. And between the overall theme, the fantastic organizers running the show, and Snowbird (!!), Ruby|Web jumped to the top of my list. Plus I don't know anything about HTML 5 or CSS 3 and I need to get on that :)
Seaside is interesting technology, how did you discover it?
Pat Seaside is a fascinating and fun technology!! I came across it a few years ago, not long after I got into Rails. Over the years I had a couple of false starts with it... it's a bit opaque at first because the development environment is so different from anything I'm used to. And it's only fairly recently with Pharo that it's become easier to get started, because it's such a clean environment geared towards development. Also the documentation for both Pharo and Seaside is getting really good. There are free books on each at pharobyexample.org and book.seaside.st/book.
Okay as for what's so interesting to me about Seaside... it's 50% the framework and 50% the Pharo environment. Seaside itself represents a step forward in web development similar to how Rails did. Rails takes care of a lot of the plumbing for you - you don't have to parse query params, set up response headers, manage the session (unless you want to of course). Seaside does all that of course but also manages application state for you. So you don't have to worry about putting stuff into a database, then pulling it back out and operating on it. I can't do it justice in a few sentences, but that's why I'll be showing lots of examples at the conference! :) At any rate, that same feeling you get when you code Rails for the first time and see how much easier things are, you get that same feeling with Seaside. It's not a replacement for Rails by any means - Rails definitely has a sweet spot, particularly when it comes to RESTful websites and interoperability with the unix ecosystem - but for the things that Seaside is strong at (which for me so far has been complex and/or configurable workflows), it runs circles around everything else.
The other thing I'm loving about Seaside development is Pharo, an open-source Smalltalk environment. Smalltalk is a great language, and Pharo has great tools that allow you to discover everything in the system. Honestly it makes RubyMine or etags look plain silly. The best bit is that nearly everything in Pharo is implemented in Smalltalk, including all of the tools. So if you want to see the mechanics of a refactoring tool, and even build your own, it's trivial to do so, because it's just Smalltalk code.
Wow this answer got long. I could go on all day about this stuff. Gonna stop now.
What other Smalltalk tools/ideas do you think Rubyists should be looking at?
Pat Let's see...I'd love to see Rubyists take the ideas from the Pharo IDE and build some really snazzy development environment for Ruby. The next killer Ruby app, I think, is going to be a development environment that uses the runtime structure of objects to do all of its magic, rather than just statically analyzing source code. Even just having portable refactoring tools would be awesome.
I'm also really excited about Maglev, whenever that becomes available for daily use. It is incredibly liberating to write actual OO code, and so I think my style of coding Rails will change completely once Maglev enters the field. fingers crossed
Which Ruby|Web presentations are you most looking forward to?

Pat In order of them being listed on the sessions page...
  • BJ Clark's (@robotdeathsquad) talk on HTML / CSS / Javascript. He told me a few months back when he planned this talk that he thought, "if I were to school Pat on HTML / CSS / Javascript, what would I say?" He and I have worked together for years and he gets frustrated with my lack of understanding of those things. So basically this talk is geared specifically to people like me, hardcore backend developers with "div-itis" and who typically use inline javascript and CSS. I'm looking forward to getting schooled.
  • Alistair Cockburn's (@TotherAlistair) samurai talk. His talk summary means absolutely nothing to me (on purpose, I'm sure) but he's always a trip to watch speak, and I'm glad to see him get more exposure in the Ruby community.
  • Evan Light's (@elight) iOS talk - Evan is a diverse developer and entrepreneur. Really excited to learn from his experiences.
  • Joe O'Brien's (@objo) communication talk. For starters, Joe is one of my favorite people in the Ruby community. Again, he's one of those folks that combines technical expertise with good business sense and a warm heart. I think folks attending this conference are going to have more of the entrepreneurial spirit than most, so his talk will be particularly important and insightful for us.
  • Dirt Simple Datamining by Matthew Thorley (@padwasabimasala) - because really, who doesn't love datamining??
It's clearly shaping up to be a rocking conference!!!


Posted by gnupate 0 comments

GoGaRuCo 2010: mini-interview with Ilya Grigorik

Ilya Grigorik (@igrigorik) is another GoGaRuCo speaker who's kindly agreed to sit down and work through a short interview with me. Hopefully this gives you a taste of what you'll be missing if you're not going to the Bay Area's regional Ruby conference.

Machine Learning and Ruby don't leap to mind as a common pairing. Why is machine learning important to Rubyists?
Ilya I don't think the topic of Machine Learning (ML) can or should be linked to any specific language or runtime - it is much more general than that. Wikipedia provides a good starting point: "Machine learning is a scientific discipline that is concerned with the design and development of algorithms that allow computers to evolve behaviors based on empirical data". Artificial Intelligence is a close cousin to this definition, and you will find a lot of people using these two terms interchangeably, but I prefer the ML definition because, to me, defining and modelling the learning process is where the work happens, whereas "intelligence" is the ultimate outcome (plus, defining intelligence is a much harder concept to agree on).
With that in mind, I think you could make the argument that Rubyists apply ML to many everyday situations already: sorting algorithms, recommendations, and so on. It is also a truism that by the time any "AI" hits the mainstream, it is usually no longer interpreted as "AI". For example, computers answering telephone calls was the domain of pure science fiction only a few decades ago, whereas now we don't even stop to think about it. How about your iTunes "genius" playlist? Pandora, Last.fm? You see where I'm going. It's all around us.
Why is Ruby a good fit for machine learning applications?
Ilya Because Ruby commands such a presence in the web development world, I think it naturally finds itself in domains and applications that stand to gain a lot by leveraging the available data in some interesting and novel way. But once again, it's not really a question of language, as much as it is a question of modelling what you know, and applying that data to interesting questions. If Ruby, as a language, allows you to model your data in a faster or easier way, then so much the better.
On the purely practical, implementation side, Ruby has a number of great libraries and plugins that allow you to leverage many interesting algorithms: support vector machines, decision trees, Bayes filters, neural nets, and so on. Will those tools scale to million-row matrices? Perhaps not, but they will allow you to iterate through a number of solutions at a minimal cost, which in itself is a big win.
Where doesn't ruby fit well in the domain?
Ilya It is unlikely that you will be analyzing a multi-terabyte dataset with a Ruby ML algorithm. More likely, you'll work on a scale of a gigabyte (or a few), model and iterate your algorithm on a subset of data with Ruby, and then implement a lower level solution to scale up to larger datasets.
You're pretty well known for deep-diving blog posts about Ruby. What's your day to day relationship with the language?
Ilya My day to day job is with PostRank, where being CTO/founder means I'm wearing many different hats throughout the day. Having said that, most of our systems are written in Ruby, and we have definitely pushed the limits of the language on many fronts.
My blog is, in many ways, a reflection of the technical challenges we're currently dealing with at PostRank, or technologies we're evaluating to improve our infrastructure. So, while I may not be working on implementing the next feature which is going into our analytics product, I am likely to be involved in the design and deployment of the infrastructure that has to deal with servicing all of the data requests required to make that feature possible (and when you're pushing as much data around as we do at PostRank, that's always a non-trivial challenge). The combination of having an awesome team, and a large and exciting problem to work on means there is never a shortage of things to write about on my blog!
Other than your own talk, what are you most looking forward to at GoGaRuCo this year?
Ilya To be honest, every single talk on the agenda sounds fascinating to me - it's hard to pick any favorites. Having said that, I have been recently thinking and talking to a few people about the topic of "test driven learning", so I'm really looking forward to the "Test First Teaching" presentation by Sarah Allen and Alex Chaffee. I am really curious how this concept could be applied more broadly, outside of just learning a programming language. For example, could you structure a physics course in the same manner? Arguably some (great) teachers do this already, but I would love to extract and distill some general rules and patterns.
Posted by gnupate 1 comments

Friday, July 16, 2010

Lone Star Ruby Conf Speaker Interview: Nephi Johnson


Okay, time for a second interview with a Lone Star Ruby Conference speaker. This time, Nephi Johnson (@d0c_s4vage) talks a bit about his presentation — "Less-Dumb Fuzzing and Ruby Metaprogramming".



Fuzzing isn't always well understood. Can you describe fuzzing, and tell us what situations it's a good fit for?

Nephi Fuzzing is a term used to describe the process of feeding an application unexpected inputs in order to find flaws in the code. During development, it's pretty much impossible to write code that will handle all possible inputs correctly. Fuzzing helps to uncover some of the more subtle and unforeseeable flaws that haven't been found through code reviews and normal testing. Fuzzing is typically very automated and usually involves feeding a program thousands or millions of sets of malformed input. The program being fuzzed is then monitored for crashes, exceptions, and/or performance.
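To make that description a little more concrete, here is a minimal sketch of the loop Nephi describes: generate malformed inputs, feed them to a program, and watch for exceptions. Everything in it is an assumption for illustration. `parse_record` is a hypothetical toy parser with planted bugs, and `mutate` and `fuzz` are made-up helper names. It is not Nephi's library or anything shown in his talk.

```ruby
# Hypothetical target: parses "<length byte><payload><checksum byte>".
# It trusts the length byte and never checks that enough data follows.
def parse_record(bytes)
  length   = bytes.getbyte(0)            # nil when the input is empty
  payload  = bytes.byteslice(1, length)
  checksum = bytes.getbyte(1 + length)   # nil when the input is truncated
  (payload.bytes.sum + checksum) % 256
end

# A well-formed input that the fuzzer mutates over and over.
SEED_INPUT = "\x03abc\x2a".b

# Return a malformed copy of the input: flip a byte, truncate, or append junk.
def mutate(input, rng)
  bytes = input.dup
  case rng.rand(3)
  when 0
    bytes.setbyte(rng.rand(bytes.bytesize), rng.rand(256)) unless bytes.empty?
  when 1
    bytes = bytes.byteslice(0, rng.rand(bytes.bytesize + 1))
  when 2
    bytes << rng.bytes(rng.rand(1..8))
  end
  bytes
end

# Feed many mutated inputs to the target and record every exception raised.
# A fixed seed keeps each run reproducible, so a failure can be replayed.
def fuzz(iterations:, seed: 1234)
  rng = Random.new(seed)
  failures = []
  iterations.times do
    input = mutate(SEED_INPUT, rng)
    begin
      parse_record(input)
    rescue StandardError => e
      failures << { input: input, error: e.class }
    end
  end
  failures
end

failures = fuzz(iterations: 1_000)
puts "#{failures.size} failing inputs out of 1000"
failures.first(3).each { |f| puts "#{f[:error]}: #{f[:input].inspect}" }
```

A real fuzzer would run the target in a separate process so it can also catch crashes and hangs, which is the monitoring step mentioned above. Rescuing exceptions in-process is just the simplest stand-in.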

The last person to really talk about fuzzing in the Ruby Space was Zed Shaw. That's kind of a tough act to follow. Why is this an important topic for Rubyists, and why are you the right person to talk about it?

Nephi I think fuzzing is an important topic for every developer (or anyone who wants to find bugs in an application). If you have the time to fuzz a product, you will almost certainly uncover flaws in it. One bug fixed during development is one less bug that customers have to experience with a released product. Not finding flaws in the code after extensive fuzzing is also a big confidence booster. I think this is especially applicable to Rubyists because of the flexibility (and fun) that comes with using Ruby. I think anyone could make their own fuzzer/data-generator with Ruby in a short amount of time. Also, if someone wanted to use the library I've written, it just so happens that I've written it in Ruby [*sarcasm*].

Why am I the right person to talk about fuzzing? Fuzzing is something that I spend most of my free time doing or working on. I put a lot of thought into coming up with ways to more efficiently fuzz programs. As a security researcher, I have different goals in fuzzing than developers. I want to find the really interesting bugs, bugs that might allow one to run their own code or do something entirely unexpected with the program. I think my perspective on fuzzing might provide different insights for those who use fuzzing outside of the security field.

What prompted you to speak at LSRC this year?

Nephi Someone had mentioned to me that it might be interesting to hear from somebody in the security field and suggested I submit a talk. I liked the idea, chose to talk about a project I've been working on using Ruby, and here I am.

Other than your own talk, what are you most looking forward to?

Nephi I'm most looking forward to the talks "Vim for the modern Rubyist", "What every Ruby programmer should know about threads", and "Getting Started With C++ Extensions." Why these talks? I love using vim: it was the first text editor I used when I started using Linux, and now it's all I use. Threads give me trouble sometimes (Ruby threads, that is), and I've been wanting to write my own Ruby extensions for a while now.

Thursday, July 15, 2010

LSRC Speaker Interview with David Copeland


With the Lone Star Ruby Conference just over a month away, I thought it would be a good idea to talk to some of the presenters. David Copeland (@davetron5000) is giving a talk about a topic that resonated with me, so I sent off an email to find out more about what he thought would make his presentation and the conference worthwhile.



I've never been a big 'web app' kind of guy, so I was excited to see "Why And How You Should Make Awesome Command Line Apps with Ruby" as a presentation. Why do you think Ruby works well in this space?

Dave Having used Perl and bash in the past, I find Ruby is just a FAR more pleasant environment; it's just really easy to make a well-designed system, the code is clearer and easier to write, and there are a lot of great libraries that are easy to install and set up. My talk doesn't go TOO heavily into this, but I think everything that makes Ruby great for web apps makes it great for command line apps.
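As a rough illustration of how little code a small command line tool needs, here is a sketch built on `OptionParser` from Ruby's standard library. The tool itself (a line counter called `linecount`), its flags, and the helper names `parse_options` and `count_lines` are all hypothetical. This is not taken from Dave's talk.

```ruby
require "optparse"

# Hypothetical tool: count lines in files, optionally only lines matching a pattern.
def parse_options(argv)
  options = { pattern: nil, verbose: false }
  parser = OptionParser.new do |opts|
    opts.banner = "Usage: linecount [options] FILE..."
    opts.on("-p", "--pattern REGEX", "Count only lines matching REGEX") do |value|
      options[:pattern] = Regexp.new(value)
    end
    opts.on("-v", "--verbose", "Print a count for each file") do
      options[:verbose] = true
    end
  end
  files = parser.parse(argv) # non-option arguments, i.e. the file names
  [options, files]
end

# Count all lines, or only those matching the pattern when one is given.
def count_lines(lines, pattern)
  pattern ? lines.count { |line| line.match?(pattern) } : lines.count
end

if $PROGRAM_NAME == __FILE__
  options, files = parse_options(ARGV)
  total = 0
  files.each do |path|
    n = count_lines(File.foreach(path), options[:pattern])
    puts "#{path}: #{n}" if options[:verbose]
    total += n
  end
  puts total
end
```

Saved as `linecount.rb`, it could be run as `ruby linecount.rb -v -p '^def ' lib/*.rb`. `OptionParser` also generates `--help` output from the banner and option descriptions.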

As a Sys Admin, CLI stuff is my bread and butter. Do you see Ruby as a good language for Sys Admins? Why or why not?

Dave Nothing's going to be "as close to the system" as shell scripts, but Ruby has some great libraries that let you write cross-platform scripts, and that's a good thing. Ruby also has a culture of terse-but-readable syntax, code-as-configuration and overall UNIXness that I think a sysadmin would find familiar and comforting. Ruby really embodies the "motivated laziness" that is the hallmark of a good sysadmin (by which I mean automating painful tasks away into something simpler). Further, there are some great management tools built with Ruby, like Chef and Capistrano.

What prompted you to present at the Lone Star Ruby Conference this year?

Dave I've spoken at a few conferences and user groups and really liked it, and I liked the idea of a Ruby-focused conference; Java-based conferences always feel a bit behind the bleeding edge to me, and the Ruby world is always pushing the boundaries. I also thought my talk would be interesting to share, specifically for the reasons you note above; Ruby and Rails go hand in hand, but Ruby is an awesome language all on its own. Plus, the last time I was in Austin, I was there for one night on a cross-country drive, so I didn't get to see much :)

Besides your own talk, what are you most looking forward to while you're there?

Dave There's a ton of really interesting looking talks scheduled; the ActiveModel/Active Relation talk looks good, as well as the NoSQL stuff and deployment talks. And, I'm sure Tom Preston-Werner from GitHub will give a good talk!

Tuesday, July 07, 2009

Finding or Keeping a Tech Job -- An interview with Andy Lester and Chad Fowler

Andy Lester (@theworkinggeek) and Chad Fowler (@chadfowler), the authors of Land the Tech Job You Love and The Passionate Programmer, respectively, agreed to do a joint interview with me. It was a lot of fun to talk with these guys, and I hope you enjoy reading this interview as much as I did doing it.


Your books look like great companions to each other. Did you interact at all when writing them?

Chad We didn't interact much, no. We actually "met" each other because of the fact that we were writing complementary books. It was good fortune more than anything else. Surprisingly, we don't overlap much in content despite the lack of a coordinated effort.

Andy I was happy when I found out that Chad was updating "My Job Went To India", because that book was one of my reasons for writing my book. It was inspiring to me to see an author who had such a positive, proactive way to look at one's career. I remember reading it, and on every page there was one of those "Yeah, that's exactly right!" moments.

There are a lot of people who hack code, fewer who hack communities, and still fewer who hack themselves. Your books (along with a few others) seem aimed at this last group. Why do you think people are more willing to work on code than on habits, health, or career?

Andy Because it's uncomfortable to admit that you might have areas for improvement. When you're improving code, you're working on something with no feelings. Besides, cleaning up code, even your own, shows positive results in only a few minutes. Changing your habits takes time, and is difficult, never mind having to admit that you might be imperfect.

Chad I think there's also a common belief that we can change everything except ourselves. That we're somehow stuck with the "self" we were born with and it's a static thing. It's a self-fulfilling misconception in that the commonly held nature of the belief makes it indeed harder to change ourselves than to change the things around us. But paradoxically, your self is the one thing you have complete and total control over.

It's also scarier to tackle big stuff like health, habits, and career than it is to work on relatively inconsequential things like code. I semi-recently read The War of Art and my major takeaway was that we tend to procrastinate the things that are most important to us, because we're afraid of tackling them. So, says the book, you can figure out which things are important by looking at which big things you're avoiding. Interesting idea. Not universally true but I've found it valuable to keep in mind anyway.

I think doing things intentionally is an underlying theme in both books. The stress of day-to-day work, or of being out of work, often pushes our intentions out the window. What advice can you give us about keeping focused on them?

Chad One thing I've learned over years of putting a lot of thought into how to best manage my career is that I am always going to be "too busy" to do the important things. I think that's true for most people. Parkinson's Law applies outside the workplace just as well as it does to an individual project or set of tasks. Almost everyone I know is "busy" no matter what they've got going on.

Andy It's all about Quadrant Two.

In The 7 Habits of Highly Effective People, Stephen Covey draws a square with two axes: Importance and Urgency. Things that are important and urgent get done automatically. Things that aren't important shouldn't get done. That leaves one quadrant, Quadrant Two, which is where you find activities that are important but not urgent. The problem is carving out the time out of your day or your week to do those things.

Parkinson's Law takes hold here as Chad says, but we also find that it's amplified by the urgency of everything else. The new website has to go live by the first day of the trade show, or a security patch sends us scrambling to update a dozen servers, or your boss needs a new report by the end of the week. All that urgency leaves us drained, but ignoring the Quadrant Two activities that allow us to move quickly when things get crazy only makes the urgency all the more stressful.

Chad So it's really a matter of prioritizing (sorry for the obvious answer). One thing I do not advocate is cutting down on fun, relaxation, health, or family time. Those are the natural things to let slip when you're faced with stress (after you let career development slip). But when you sacrifice the "living" part of life, you burn out fast. And from my experience, burnout is the fastest ticket to mediocrity.

So if burnout is the fastest way to being unremarkable, it's the thing we have to attack most ruthlessly. How do you do it? I don't have a definitive answer but my current philosophy is that if you're faced with too much to do and are stressing, you should take inventory of all the stuff you don't want to do and stop doing those things.

It may sound like I'm advocating laziness — and I am, but in the same way Larry Wall does. Some of the stuff we hate doing really isn't worth doing. We can just stop doing it. But most of what we hate doing still needs to be done. As programmers, we have a unique advantage here: automation. We don't have to rely on tools others have created to automate our work. We can do it ourselves.

Andy As a long-time Perl user and advocate, it's no surprise I love Larry's view on laziness. The corollary to laziness is the idea that "machines should work, people should think." Any time you're spending doing some sort of work that the machine should be doing is wasted time, because you're spending your brain power, which is incredibly valuable, doing things that can be done with computer power, which is cheap and getting cheaper every day. We have these amazingly powerful tools, if we just put the time and mental effort into making them do even more.

The barrier to entry, however, is the unwillingness to increase our knowledge to let us know how to make the computer do a given task, or to take the time to make it happen. Larry calls this "false laziness." If you've ever said "It's faster for me to retype this than to hack together a conversion tool for this data", that might well be false laziness. What about the next time you need to convert similar data? It doesn't take many iterations before you say "Geez, I should have spent the time up front."

It takes determination to get over the hump, but after a while you get into the groove and you find yourself saying "I don't mind doing extra work up front because I know I'll get more brain cycles back days down the road." More free brain cycles = less burnout.

Chad If you start automating everything in the workplace, you'll not only make your life better, freeing yourself to focus on the important task of leveling up as a software developer. You'll also save your company money, reduce the turnaround time for tasks, and reduce the chance of human errors.

I think it was Martin Fowler who said, "If you can't change your organization, change your organization". How do we choose between investing in ourselves (in place) and investing in finding a better place?

Andy It's simple, if not easy, cost-benefit analysis. The problem that most people run into is not knowing what benefits they are looking for. If we only go by reflex, we might look at salary, perceived job security, and technical specifics of the job that we have and the one that we might be going to. Unfortunately, those three factors are rarely the only relevant issues. What about your co-workers? Work/life balance? The industry you'll be working in?

The one crucial point I tell anyone who is unhappy with a job and looking to go elsewhere is to make sure that when they finally look to make the jump, they are moving to somewhere good, not running away from somewhere bad. Running away leaves you in a position of weakness. You're more likely to make desperate moves, take unnecessary risks, and accept jobs or salaries that you would ordinarily turn away.

Chad Andy's point about going to good vs. running from bad is profoundly right.

I'd add that it's easier to hope for external forces to change your situation than to change your own situation. So I'd recommend starting with the assumption that there is a better way to perceive and respond to any work situation until that proves to be incorrect.

Also, I think a lot of "knowledge workers" tend to develop an unhealthy attachment and relationship to their employers. We come to think of a job as a place to go and live. Therefore we establish a sense of entitlement about how things should be and how "fair" work life should be. Ultimately, the employer/employee relationship is a series of business transactions. It's not a family or a home. Remembering that it's a business relationship can help you make better decisions both for yourself and for your job.

Lots of people don't understand the technology job space. What makes it so different than working in other industries?

Andy As far as job hunting, your peers are as smart as you are, and you're held to a higher standard than most other industries. The hiring manager is probably as much of a geek as you are, and is going to scrutinize you like the detail-oriented person she is. We also understand the Internet in a way that other professionals usually don't. For example, I've seen plenty of articles for non-techie job hunters that warn that your online activity could be checked out by a potential employer, so watch out with those drunken frat party photos on MySpace. Show that article to anyone in our industry and he'll say "Duh, of course."

We also have to perform at a higher level to show that we can do the job. It's not enough to come in for an interview, answer some questions and hope you get picked. You need to show that you can do the job, either by showing prior work that you've done, or by telling compelling stories about your background. If you're not going to take the time and effort to step up the level at which you're job hunting, someone else will, and you'll be shut out.

Chad I don't think it's very different from other "knowledge work" industries. The one big difference is that technologists (particularly programmers) have the ability to build up a portfolio of work that anyone can use. Software is everywhere, so it's possible for a programmer to create something that can touch literally anyone's life. And software doesn't cost per-unit like, say, a piece of furniture does. This means that if a software developer creates something on his or her own time, it can be shared at no cost with anyone they want to share it with. From the perspective of the job market, this is a powerful tool. There aren't many industries where a candidate could give a piece of their work to every potential employer to actually keep and use. I think programmers should take advantage of this and create Free software to distribute as part of the job search process. (There are countless other benefits to creating Free software that I'm not focusing on here obviously).

Andy Samples of your work are crucial to the job search. I think that anyone hiring programmers who doesn't see samples of the code written by the candidate is crazy. You wouldn't hire a chef without tasting his cooking, so why are programmers different? Ten-minute exercises at the interview like "Write a function to do X" may weed out the bottom feeders, but seeing a sizable chunk of code tells so much more. I can tell so much about a programmer by reading five pages of code that verbal discussion just doesn't bring out. How does she name her variables? Does she create beautiful code? What gets documented? Has the code been maintained, or is it a pile of hacks and bolt-ons?

Even if the hiring manager doesn't ask for code samples at the interview, bring them with you anyway. Seeing samples of your good work lowers the risk in the mind of the manager. Given two candidates who are roughly similar in skills, but one can show evidence of his working abilities, who do you think the manager is going to pick?

The problem with code samples is that many people are not at liberty to disclose what they've worked on in their jobs. You don't have to work for a military contractor or a stealth startup to run into this problem. Working on Free Software or open source software is your way around this. You can work on existing projects or start one of your own, and your code is available for anyone in the world to see, especially your future employer.

If you've read each other's book, what's the best advice you took from it?

Chad What I like about Andy's book is how tactical and detailed it gets. For example, his description of the actual interview process is spot on. For those of us who haven't gone through interviewing in a long time, it presents a clear walk-through of what to expect on a real interview. It's like actually being there but with Andy sitting by you giving advice the whole time. For me, I can imagine that taking the pressure out of the situation if I were nervous about an interview. If you follow all of his advice on the interview day, you'll likely be one of the most prepared candidates the interviewers ever see.

Andy The advice about making sure that you're the worst guy in the room is tough. It struck me when I first read it in the first version, and I try always to remember it.

The idea comes from jazz guitarist Pat Metheny's advice to "always be the worst guy in any band you're in," because you'll be surrounded by better players and will naturally play better and will learn along the way. When you're the best player in a band, you're less likely to learn from others.

I find myself mostly applying this to open source projects, where I'm surrounded by fantastic programmers from around the world. I need to remember to appreciate the skills of those around me, and learn as much as I can. The tough part is that it's so intimidating.

A lot of hackers find the non-programming aspects of our jobs unpleasant or worse. What's been the hardest non-programming task to adapt to for you?

Andy As always, it's dealing with other people, and remembering the robustness principle of "Be conservative in what you do; be liberal in what you accept from others" when dealing with others from work.

For being conservative on output, it's always tough to remember that our geek argot and our overly direct way of saying things can be off-putting to others. It's especially dangerous because one slip of the tongue can leave you marked as a jerk for quite a while.

Easier, although still frustrating, is being liberal in what I accept as input. Say I've got someone who's reporting a bug in the software, and he says "Yeah, I tried to update a batch, and it didn't work," and I've got to go through the process of "How did it not work" and "what specifically did you try" and all the classic debugging questions. When I first started programming, I'd be so mad that the person I was talking to didn't say all the right things, or didn't have my thought process. It took a while to learn to accept that.

Now, these are minor annoyances, but they don't bother me. It's like being annoyed at the rain, but not letting it ruin my day.

Chad For me it was probably before I became a programmer. But it still applies, because it happened while I was a hacker of a different sort: a musician. At nights I did my "real" career as a professional saxophonist. In the mornings, I had an extra job as a forklift operator. I actually loved both jobs. Some days I even preferred the fork lift job.

I was really good at the fork lift job. I got great satisfaction as a part time contractor at beating all of the full time employees on the truck dock where I worked in productivity every day by a significant percentage. I would basically get to work and run for my entire shift, either on foot or virtually on a forklift. It was a rewarding job and the bosses were taking notice. They wanted to bring me on full time, which would give me benefits and a great deal more pay. Especially given my musician's living, it was a tempting offer.

But I was painfully introverted. I was so shy, in fact, that it was almost a problem on the truck dock. People constantly took shots at me because I was such a pushover and so obviously uncomfortable around people.

I knew this was going to be a life-long limitation if I didn't do something about it.

So I quit my beloved fork lift job and started working as a waiter.

It was a miserable experience. I panicked daily at first having to interact with group after group of strangers while also juggling their orders, special requests, and (physically juggling) their plates. I was the worst waiter I've ever witnessed. I would sometimes leave work with less money than when I started due to the way waiters have to give a percentage of sales to the supporting restaurant staff.

But over time, though I never really became a passable waiter (great respect to all of you who have ever done that job successfully!), it was that experience that gave me the comfort to interact with people that has probably been the single most important career development move I will ever make. I threw myself way out of my comfort zone and have become the sort of person who is (hilariously) described as "the most extroverted programmer ever" and that sort of silly thing. It's been the key to my success in corporate environments as well as the change that enables me to do things like organize and speak at conferences, give training, do on-site consulting for clients, and basically most of what I make my living at now.


What about you? What's been the hardest thing for you to pull into your work-a-day lives?

Posted by gnupate 3 comments

Tuesday, June 30, 2009

Ruby Hoedown 2009 mini-Interview with Jeremy McAnally

Jeremy McAnally (@jm) is a good friend, and we've worked together on regional Ruby conferences and other projects. With the Ruby Hoedown looming, I thought it was about time to sit down with him for a mini-interview about his free conference.


How have the community and your sponsors responded to making the Ruby Hoedown free?

Jeremy Everyone has largely been in two camps: "Wow that's awesome!" and "What? Are you crazy? How are you going to do that?" I find myself somewhere between the two.

You've even announced a plan to help encourage registration, where did this idea come from?

Jeremy I was thinking about how to get more sponsors involved and how to get them involved with everyone more effectively. Book giveaways and things like that are great, but then only 5-6 people are getting to benefit and the sponsor is really only "touching" those people. Around that same time I was trying to think of another way to differentiate the Hoedown and add value to the conference, and the ideas sort of gelled together around the time one of my co-workers was talking about the MacHeist bundles he'd gotten this year.

You're getting really close, but this is the hairiest time as you get all the last minute stuff moving. How are you feeling about this year's Hoedown?

Jeremy I feel great. There's actually a lot less hassle this year than last year, so it's been going pretty smoothly thus far. I may not feel the same way in a month or so, but for now, I'm feeling good!

There are just hours left for people to get proposals in. Is there anything specific you're looking for?

Jeremy Anything even vaguely Ruby-related. I love a good technical talk, but I'd also like to hear why pomodoro timers are the best things since sliced bread or why eating celery can help improve not only your test coverage, but also your overall quality of life.

Wednesday, June 24, 2009

A Ruby Couple: Interview with James and Dana Gray

It's another week without a Questions Five Ways discussion, but I've got another great interview that more than makes up for it. James Gray (@JEG2) is very well known in the Ruby community. His wife, Dana, is less well known, but won't stay that way for long. Fresh from her Ruby presenting debut, a lightning talk on Ruby regular expressions at MWRC, the two of them are embarking on a joint training session at the Lone Star Ruby Conference.

I asked the two of them if they wouldn't mind doing a short interview with me, and am really happy that they agreed. Here's what we covered.


Dana, if I understand correctly, you gave your first presentation at MWRC this spring. Now you're teaching a class with James. That's a pretty steep curve. Other than having a great team member, what's helped you navigate it?

Dana I think it goes without saying that James is an amazing mentor and a very gifted programmer and without his patience and enthusiasm, I certainly wouldn't be where I am now. But you asked aside from him, so...

Ironically, teaching comes naturally to me, especially in the tech world. Before I came to work with James, I was, among other things, the software trainer for sales reps at a big food company. I was responsible for not only developing the software system for managing their business relationships but for teaching them how to use it as well. So training at a tech conference where I get to teach people who already speak my language, well, that seems pretty much like having cake and getting to eat it too.

I think the other thing that has really helped me grow as a programmer is simply doing it every day, five days a week. If I don't work and learn every day, I don't pay the mortgage and that is a pretty strong motivator. This industry wakes up in a new world every other month or so and to survive you have to be willing to learn new stuff all the time. I'm just glad LSRC gives me the opportunity to share some of what I've learned with others.

James, you've been a member of the Ruby community for a really long time, and you've done a lot for it. Which experiences stand out as things you want to pass along in your training?

James The Lone Star Ruby Conference trainings, which this will be my second year doing, are a neat opportunity for guys like me. I'm really just a pretty average programmer, but it turns out that I can teach a little. In fact, I used to teach chess for a living.

I have seriously looked at doing private trainings where I live in OKC, but that's a really big gamble. To get it going I would need to commit a lot of resources and then just pray that 20 or so people would be willing to fly out here for a few days and pay what I'm asking. Otherwise, I could take some pretty heavy losses. I haven't been brave enough to try that yet.

At the LSRC though, all I need to do is show up and teach. Plus with hundreds of programmers coming in, surely a handful will be interested in taking the class. It's easy for them too. LSRC, which I jokingly call The Foodie Conference, will even feed all of us. That makes it a great fit all around.

How did the two of you come up with the idea of putting on a training course together?

James I did a training at last year's conference with Gregory Brown called The Ins and Outs of Ruby I/O. It being our first big training, we did make a few mistakes. The main error was that we only asked for three hours since we were scared to commit to a full day. Then we planned about two days' worth of content and didn't end up getting to everything. However, we had a terrific group attend. Even Matz dropped in and helped answer some questions. The end result was that it turned out pretty great, in my opinion.

This year Gregory and I both had good ideas for new trainings and we couldn't decide what we liked better. So we found more help and split into two teams. Gregory, with the help of Brad Ediger, will be doing a pair of trainings inspired by Ruby Best Practices (his new book). Dana and I joined forces to deliver the Moving to Ruby 1.9 Workshop.

We have all been waiting for Ruby 1.9 to be ready for the road for so long, I think some of us have actually missed the fact that it has happened. It officially has a production release, Rails runs there now, and we're definitely seeing people start to make the move. The time has come. It's a big jump though. A lot has changed. For example, I was able to write a series of eleven blog posts about just one of the big changes. The training gives us a great opportunity to dig into all the new stuff and show people both what you have to know and also just how you can use the new features to improve your Ruby. Hopefully that makes the training a good source of knowledge that a lot of us are looking for right about now.

Dana Last year at LSRC, I took James and Greg Brown's training and was excited to see how well it was received. It was pretty different from my previous training, where the sales reps I trained didn't really get a choice. Here were people who not only volunteered to take this class, they paid to do so. So when James said Greg was thinking about doing his own training, I volunteered to help James give his full day training. I was excited by the idea of being involved in the knowledge-share of Ruby's future. And it is a good opportunity for me to learn about Ruby 1.9, since I have to teach it. :) Besides, my experience as a trainer will help James stay on track. He tends to hot air balloon. :) I'll help develop and break down the material into digestible pieces and lots of hands on labs.

James She's right, I need the help.

That was probably the second mistake of last year's training: we built it too much as a huge brain dump from Gregory and myself. The attendees endured and even steered us a bit with their discussion, but it was definitely an endurance test for them. This year, Dana and I are planning a much more interactive environment that will be much better for relaxed learning.

What other joint Ruby activities do you foresee yourselves working on?

Dana Most likely more conferences and running the OK Ruby users group. Not to mention running our company, which keeps us plenty busy.

James That's a great question I'm not sure I know the answer to. Dana and I are best friends who really enjoy each other's company, in addition to being married. We are those rare people that can spend all day in each other's company and not get tired of ourselves. That makes us well suited to work together.

For now, we've been traveling to the Ruby conferences together, working on our speeches together, and obviously building applications together. Now we have this training to teach together. Who knows what we will find to try next.

Dana doesn't do as much open source work as I do, but I do have one project I keep hoping I'll be able to drag her into eventually. . .

There are a lot of under represented (or misrepresented) groups in the Ruby community (and other tech communities). The two of you seem to be doing really well. What are some of the keys to your success?

James My secret weapon is actually Dana.

I ran this business for many years on my own. It did OK. It took care of me and things gradually improved. However, when Dana came to work for me, almost exactly a year ago now, is when things really took off. We have all the work we can handle, we're working on some killer cutting edge projects, and we're making it to more conferences. I guess she's what my business was always missing!

Dana I think the biggest key to our success is simply us. James and I are very close. We work well together in just about everything we tackle, from buying a new car to running a business. We understand each other and we listen to each other. We are intimately aware of each other's strengths and weaknesses and I think we are extraordinarily lucky that we complement each other like that. We work with some great people out there, like the guys at Highgroove, which helps us focus on the things we like to do. And we stay away from anything that we feel is not a good fit for our goals. We work hard but try not to get too focused on work at the expense of anything else.

James We did make some good decisions when we built our programming business. My mother does accounting for small businesses and she really helped me out there. You would be surprised what a difference that makes. I see people around me frequently make classic mistakes she steered me away from right from the beginning. It's hard to be successful when you start off with big disadvantages. I think I could teach another class just on the right ways to build a small business.

Posted by gnupate 3 comments

Friday, June 19, 2009

People Behind GoGaRuCo, Josh Susser

I didn't get a Questions Five Ways discussion done this week; hopefully I'll get that back on track next week. I did finish up another project I've been working on for far too long though, an interview with Josh Susser (@hasmanyjosh), one of the GoGaRuCo organizers. Josh is a longtime member of the Ruby community, and a very smart guy. I'm grateful that he took the time to talk with me, and I hope you enjoy this interview too.


What motivated you to organize a regional Ruby conference?

Josh I started joking last month that if you don't think you're getting enough email, you should organize a conference.

Actually, I got the idea for doing a Ruby conf in SF about two years ago. We've had some other Ruby events in the area, but with all the stuff we have going on here in SF it seemed there was a huge community that was being under-served. It was almost embarrassing that all these other cities had awesome local confs, but we didn't. I also wanted to do something to help energize the local Rubyists. I think I felt like I wasn't getting enough out of the local community, and I feel that the best way to change that is to put more into it yourself.

Who else is/has been involved in organizing GoGaRuCo?

Josh Leah Silber has done most of the heavy lifting for organizing the conference, dealing with sponsors and logistics. I've been focusing on the technical program and wrangling speakers. We're a pretty good team. We've also had a lot of support from our employers, Pivotal Labs and Engine Yard - while they aren't technically producing the conference, they are doing a lot more than sponsors usually do for a conference. And we had a great team of volunteers too - without them we would have had the lamest conference ever.

Now that you've recovered from the first go-round, are you ready to get back on the horse and hold GoGaRuCo 2010?

Josh Sure, both Leah and I are up for that. We're really happy with how our first year went, but already have ideas for how to make next year even better. We'll probably expand from 200 to about 250 people because you can do more that way, but we want to stay intimate and single-track.

Who's the target audience for GoGaRuCo? How are you reaching out to them?

Josh We are definitely focused on people who write Ruby code for their living, or at least for a lot of their hobby time. Some shows have a lot of content for managers, VCs, and business people, but that's not us. If you're not a programmer, you might find this conference boring. But if you are a programmer who loves writing Ruby code, this is the conference you don't want to miss. We also assume that by now we don't have to have a lot of introductory content. There's a place for that kind of material, but we felt for what people were paying they deserved to see advanced material, and stuff that hasn't been seen anywhere else.

Reaching out has been pretty easy. We marketed the conference with blogs, Twitter, announcements at local meet-ups, and posting to email lists. Basically word of mouth and social networking tools.

A lot of regional Ruby conference organizers tout something unique about their conference. What makes the GoGaRuCo special?

Josh I think San Francisco is pretty special! And we tried hard to put together a program that represents that SF character. It's not just about improving your technical skills, but also what you use those skills in service of. So we had a couple of talks that weren't as much about the technology as they were about what you can use all this amazingly powerful technology to accomplish, and what kind of impact that can have on people's lives.

We also tried a few experiments with how we ran the conference. Things we'd always thought we'd like to see, improvements we wanted to try out. One of them was how we put together the program. We didn't do the typical CFP then have a couple people select from the proposals. I knew from the start there were certain people I wanted to present, so I just invited them to speak. That filled up half the program. The other half was selected from proposals through voting by registered attendees. I think that got us some speakers that otherwise might not have submitted proposals or had them chosen by a small committee. People really liked that idea and the program seems to have benefitted from it. We did get some surprises in the voted talks, which can be good or bad depending on what you were expecting. So far, the feedback has been mainly positive. We rushed the whole voting thing, but with a little preparation I'm sure we can make it work much better next year.

I loved the GoGaRuCo Wrap ... Where did the idea come from? Has it had the payoff/impact that you were hoping for?

Josh Thanks. The idea for the Golden Gate Ruby Wrap came when we were thinking about doing color printed programs, which everyone seems to do but costs a lot and has low value for both attendees and advertisers. You don't really need a program for a single-track conference anyway. But you do need some place to put the sponsors' ads! So we thought we should do a PDF zine to save paper and also create something of lasting value. It's good for everyone, including the advertisers. Instead of 200 people looking at a program for 30 seconds then throwing it on the floor, many more people get to spend a while reading the zine and will keep it around for reference, maybe for a long time. We got some very good write-ups for the talks, have some great photos taken by Andy Delcambre, and kept the focus on the community by including interviews with a number of attendees. So far we've had over a thousand downloads, so I think the impact is already pretty good. We didn't really get to include everything we wanted to, but next year we'll know how to do it better. I'm also hoping the idea catches on and we see more conferences produce what amounts to proceedings-in-brief. We don't mind folks copying us here at all.

It seems like the regional Ruby conference field is getting congested. What, if anything, should we be doing about it?

Josh That seems like asking what we should do about there being so many reality TV shows. I don't think there is a "we", and there isn't anything to do either. The field of regional conferences is an ecosystem, and those things tend to take care of themselves. Rubyists will go to the conferences they like, sponsors will help fund the conferences that are worth their support to them, and those conferences will continue to be produced. The others won't. We've seen that happen already. I think it's a positive thing, evolution selects the ones people want to see continue.

What other language(s) would you like to see have a presence at regional Ruby conferences? (Or should we be solely focused on Ruby?)

Josh I think keeping the main focus on Ruby is what people want in a Ruby conference, but there are certainly complementary languages and other technologies that can add to the value people get from Ruby. Nathan Sobo's talk on Unison and June was a great example of how Ruby and JavaScript frameworks can work together to overcome a difficult problem. Ruby doesn't exist in a vacuum, so it makes sense to show things that can help you use Ruby better.

If someone wants to get a regional conference started, what advice would you give them?

Josh The thing nobody ever tells you about running a conference is how hard it is on your body. I've said that to other organizers and they all smile painfully and nod. It's also demanding on your personal life, and can be really hard to work out with your job too. So I'd say you really have to want to do it, and you have to be ready to make a few sacrifices and deal with a lot of stress to make it happen. But it's also really worth it, so I'm not complaining.

The most important advice is that, like anything worth doing, you can not do it alone. It's crucial to have a good core team of two or three people who can keep things moving, and a larger team of people who can help get things done.

And lastly, make sure you have fun or you're not doing it right. Leah and I took this on to create the conference we ourselves always wanted to go to. If you're going to do something this big, it has to be an expression of yourself and what you're passionate about. Don't just try to be like everyone else, but use your own perspective and creativity to make something that is your own. Otherwise, you might as well go to someone else's conference.

Friday, June 05, 2009

Feedzirra and Typhoeus: an Interview with Paul Dix

With my recent Questions Five Ways series, I've gotten away from my regular interviews. To break that dry spell, I've got a pair of great interviews for you. The first is with Paul Dix (@pauldix), the developer of Typhoeus and Feedzirra and budding author (see below). He and I had a good talk about how he builds such great libraries. Read on, I think you'll like it.


What kind of hacking do you do when you're not building cool new libraries for the community to play with (and argue about)?

Paul At work I'm doing Ruby stuff with a mix of Sinatra and Rails. The company I work for, kgb, is building a new web product in the aggregation/search space. That area has been my primary focus for the last three years so I get to play with stuff like machine learning and natural language processing. We're a bit early on so I'm not sure yet what I'll be using for that. Probably Java because of the availability of libraries, more speed, and better memory use/cleanup. Six months ago I did some research into the methods being used by competitors in the Netflix prize. It was really interesting stuff and I did most of the heavy lifting in Java. I started out with Ruby, but found that I needed a little more efficiency. However, I still did all my scraping and data preparation in Ruby.

During my free time I'm not getting to do much hacking these days. I'm working on a new book for Pearson called Service Oriented Design in Ruby and Rails so that takes up quite a bit of time. However, things like my new library are directly related to my day job and the topic of the book. So free time hacking along those lines will probably be fair game for the next six months.

Have you considered using JRuby to make the bridge between Ruby's accessibility and Java's libraries easier to cross?

Paul I definitely have considered JRuby and I've used it before. However, with machine learning tasks it really comes down to speed of execution. Some jobs can take minutes to hours even with Java. Speedup factors of 2-10 times or more matter in these cases. Of course, I pulled that number completely from thin air. Depending on how it's used JRuby may be a viable competitor to Java, but I'd have to test to make sure. For interfacing with Java libraries it's definitely a good option.

How did you discover Ruby? Why have you stuck with it?

Paul In 2005 I was working as a C# programmer at McAfee. I had arranged to quit work and go back to school in the fall. Since I didn't have to worry any longer about making my living day to day in the Microsoft world, I decided it was a good time to switch up languages. My first stop was Python. I read Mark Pilgrim's Dive Into Python [now available for Python 3] and wrote a few scripts. I thought it was ok, but kept looking around and found Ruby.

I kind of picked up Rails and Ruby at the same time. I read through the first edition of Agile Web Development with Rails [now on its third edition] and Programming Ruby [also in a third edition]. I liked it, but that's not what really made me stick with the language. I had also moved to NYC and started going to the NYC Ruby users group (nyc.rb). I think that's what really made me stick with Ruby. I enjoyed the community and how passionate everyone was. The development gains with Rails didn't hurt either.

It sounds like finding Ruby was something of a process. I'd assume you're still looking at languages as the process continues. What other languages have moved beyond catching your eye and are holding your attention?

Paul I try to be deliberate about what skills I focus on and languages are definitely a part of that. At this point nothing is really holding my attention as a replacement for Ruby. I haven't seen anything that fits that bill so instead I'm looking for supplemental languages that cover Ruby's weak spots. I'm playing around with Scala at the moment and using it to implement something at work that will be very high traffic. I think languages geared towards parallel, distributed computing are very interesting right now.

How did you get interested in Ruby's http and xml performance?

Paul I've had an idea to build a feed aggregator/search service for a while now. Most of my free time hacking has revolved around playing with different pieces of that system. Obviously, a big part of that is pulling stuff down from the web and parsing it. I got even more into the HTTP thing with my current job. The system we're building is based on different HTTP services written in a variety of languages. I want Rails to be the front end for these, so a performant HTTP library is the first step.

Could you walk us through your process for building highly performant code?

Paul Generally I start out with something that I think is slow. Like feed parsing, for instance. I didn't like the existing libraries and I thought it could be done better. Luckily, Nokogiri had recently been released and was touted as super fast (which it is). From there I wrote a spike. No TDD, no elegant design. Just some ugly spaghetti that will actually do the basic task. That's enough to wrap a benchmark around so I run that. If my approach is fast then I figure I've proved out the concept. From there I rewrite the whole thing using TDD and put a little more thought into API design.

On Typhoeus it was a bit different. There were already some speedy options out there, they just didn't work perfectly for me for one reason or another. So I built on the work that Curb had done and completely relied on libcurl for the real speed.

With any of my libraries where speed is a concern, some of the processing is done in C. It's just a fact of developing with Ruby. If you want performance you need to drop down to C. The truth is that my libraries really just piggyback on other developers' awesome code. It's just a matter of bringing it into Ruby or exposing it through another API.

There are two other things that I generally do as I write libraries. One is quantifiable and the other, not so much. I try to think about how many times certain sections of code will be run. This is the standard stuff you find in a data structures and algorithms course: is an algorithm O(n) or O(n^2) or whatever? I'm not actually doing the big-O calculation, but I definitely think about whether I'm executing something in an inner loop. It helps to have in your head the performance of a hash lookup vs. .include? on an array, whether you're doing an eval, or any of the other things that might have an impact on speed.

The more quantifiable approach is to use ruby-prof for checking how long different calls are taking and where your memory is going. That library is crazy awesome.
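
[ed. The hash vs. .include? comparison Paul mentions can be measured with Ruby's standard Benchmark module. What follows is a minimal sketch and not Paul's code; the data set, the variable names, and the iteration count are all made up for illustration.]

```ruby
require "benchmark"
require "set"

# Hypothetical data: 10,000 integers to search through.
items   = (1..10_000).to_a
as_hash = items.each_with_object({}) { |i, h| h[i] = true }
as_set  = items.to_set

needle = items.last # worst case for a linear array scan
n      = 1_000      # pretend this lookup sits in an inner loop

# Array#include? walks the array element by element (O(n) per call).
array_time = Benchmark.realtime { n.times { items.include?(needle) } }
# Hash#key? and Set#include? hash the key and jump to it (roughly O(1)).
hash_time  = Benchmark.realtime { n.times { as_hash.key?(needle) } }
set_time   = Benchmark.realtime { n.times { as_set.include?(needle) } }

puts format("Array#include?  %.5fs", array_time)
puts format("Hash#key?       %.5fs", hash_time)
puts format("Set#include?    %.5fs", set_time)
```

The absolute timings depend on the machine and the Ruby version, so treat the numbers as relative. The point is only that a lookup repeated inside a loop is worth measuring before picking a data structure.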

Other than profiling, what kinds of code analysis are you doing, and what tools are you using to do it?

Paul I like to bounce ideas off of other people in the Ruby community. We have regular hackfests and I'll break out my code to see if people can point out where it sucks. I think code review is something that more people need to focus on.

I probably should be using things like flay, flog, and reek, but I haven't started yet. Another thing I really want to check out is Aman Gupta's perftools.rb. He gave a lightning talk on Google Perftools and his Ruby bindings to them at GoRuCo [ed. Aman's talk isn't posted yet, but hopefully will be soon].

Since you work TDD style on production code, which testing/mocking libraries do you use and why? Have you looked at tools like cucumber to guide your test writing?

Paul I've been using Rspec for a while now. I'm not a ninja, but it does what I need. I prefer the general style of 'describe' blocks and 'it' calls. I like the built in matchers and how the test code reads. I also used FactoryGirl for fixture data for a while, but my current work isn't backed by ActiveRecord so it's not applicable.

I looked at cucumber briefly, but I honestly don't like it. I know I'm in the minority in the Rails community right now for that opinion, but I found that I spent way too much time writing regexes and building up my test suite to work. I've talked to people that do client consulting that find value for client communication with cucumber style stories. I just don't have that need. I find that the regular test code is plenty readable for the people I work with and I don't have to spend a bunch of extra time testing. Ultimately, testing is just a means to an end. It's easy to get bogged down and spend an inordinate amount of time writing tests. I like to focus more on the implementation and what the code can actually do.

What approach do you use in working with C — Ruby-Inline, FFI, the traditional C API, or something completely different?

Paul I use the traditional C API. I should probably move over to FFI to make things play nicely with JRuby and Rubinius, but I haven't gotten to that yet.

What approaches to concurrency/parallelism are you finding most useful?

Paul I think the reactor pattern is really good. It's what EventMachine and libcurl multi use to fake parallelism. I find that's much easier than trying to deal with managing a thread pool and running IO through that. It's fine to run single threaded since we're using multiple processes anyway. You'll get a chance to peg your CPU even with multiple cores.
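
[ed. For readers who haven't met the reactor pattern, here is a toy version built on IO.select from Ruby's core library. It is only a sketch of the idea, not how EventMachine or libcurl-multi are implemented; the Reactor class and its method names are hypothetical, and pipes stand in for network connections.]

```ruby
# A toy reactor: one thread, one loop, many IO sources.
class Reactor
  def initialize
    @handlers = {}
  end

  # Run +handler+ whenever +io+ becomes readable.
  def register(io, &handler)
    @handlers[io] = handler
  end

  def unregister(io)
    @handlers.delete(io)
  end

  # Loop until no IO objects remain registered.
  def run
    until @handlers.empty?
      ready, = IO.select(@handlers.keys) # blocks until something is readable
      ready.each do |io|
        handler = @handlers[io]
        handler.call(io) if handler
      end
    end
  end
end

reactor  = Reactor.new
received = []

# Two pipes stand in for two in-flight network connections.
pipes = 2.times.map { IO.pipe }
pipes.each_with_index do |(reader, writer), i|
  writer.write("reply #{i}")
  writer.close

  reactor.register(reader) do |io|
    data = io.read_nonblock(1024, exception: false)
    if data.nil?                  # EOF: this "connection" is finished
      reactor.unregister(io)
      io.close
    elsif data != :wait_readable  # got bytes; otherwise try again later
      received << data
    end
  end
end

reactor.run
puts received.sort.inspect
```

Everything happens on a single thread: the loop sleeps in IO.select, then hands each ready IO to its callback, which is the "fake parallelism" described above.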

Are there other patterns (not necessarily concurrency related) that you find especially powerful/useful in Ruby?

Paul Ruby lends itself well to creating DSLs. ActiveRecord, DataMapper, and many other libraries have great little DSLs for building up classes with complex behavior with fewer lines of code. I copied those styles when creating SAXMachine and Typhoeus.
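
[ed. A small sketch of the class-level DSL style Paul is describing. The Fields module and the field macro are invented for illustration; they are not the API of SAXMachine, Typhoeus, ActiveRecord, or DataMapper.]

```ruby
# Hypothetical mixin that adds a `field` macro to any class including it.
module Fields
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Declares a field: records it and defines a reader and a writer.
    def field(name, default: nil)
      fields[name] = default
      attr_writer name
      define_method(name) do
        value = instance_variable_get("@#{name}")
        value.nil? ? default : value
      end
    end

    # Registry of declared fields and their defaults.
    def fields
      @fields ||= {}
    end
  end
end

class FeedEntry
  include Fields

  field :title
  field :author, default: "unknown"
end

entry = FeedEntry.new
entry.title = "Hello"

puts entry.title                # Hello
puts entry.author               # unknown
puts FeedEntry.fields.keys.inspect
```

Two declarative lines in FeedEntry stand in for the hand-written accessors and defaults, which is the "complex behavior with fewer lines of code" trade Paul mentions.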

The other thing that I really like is the use of proxy objects for lazy evaluation. I used this technique in Typhoeus to avoid making HTTP calls until absolutely necessary. When you make a call it actually puts it on hold and gives you a proxy object. The remote HTTP call isn't actually made until you access something on the proxy. That style made it much easier to hide the details of gathering a group of HTTP calls before calling out to libcurl-multi.
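
[ed. The lazy proxy idea can be sketched in a few lines of plain Ruby. This is an illustration of the general technique only, not Typhoeus's implementation; LazyProxy is a hypothetical class, and a block returning a string stands in for the remote HTTP call.]

```ruby
# Hypothetical lazy proxy: defers a block until the result is first used.
class LazyProxy < BasicObject
  def initialize(&block)
    @block  = block
    @loaded = false
  end

  # Any method call forces the deferred work, then forwards to the result.
  def method_missing(name, *args, &blk)
    __target__.public_send(name, *args, &blk)
  end

  private

  def __target__
    unless @loaded
      @target = @block.call
      @loaded = true
    end
    @target
  end
end

calls = 0
response = LazyProxy.new do
  calls += 1        # stands in for an expensive HTTP request
  "response body"
end

puts calls            # 0: nothing has run yet
puts response.upcase  # first access forces the block
puts response.length  # served from the cached result
puts calls            # 1: the block ran exactly once
```

Subclassing BasicObject keeps the proxy nearly method-free, so almost every call falls through to method_missing. A handful of methods that BasicObject itself defines, such as ==, are not forwarded in this sketch.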

Earlier, you mentioned that you are working on a book. When should we be looking for it to hit the shelves?

Paul I'm really early on in writing the book so it won't be out for a while. The pre-release chapters will be put on Safari Bookshelf as I finish them. The first bit should be there in the next month or so. The final version of the book probably won't be in dead tree form until March of 2010.


Posted by gnupate 0 comments

Tuesday, May 12, 2009

On Ruby Interview with Pat Eyler

For a bit of a change-up, Sean Carley (@milythael) is running this interview.

Pat has become a well known online author in the Ruby community with frequent book reviews, interviews and posts on various useful topics. When he asked the community who we would like to see interviewed, I turned the tables on him.


You've become a well known blogger in the Ruby community. You're active in organizing the MountainWest RubyConf and have started or helped start multiple Ruby brigades, including Seattle.rb, as well as other programming user groups. What makes you stay so outwardly involved in the software development community?

Pat Long before I was involved in the Ruby community, I'd become involved in the Free Software community. Before that, I was involved in other groups that had cultures of community involvement. Joining a community and then working to improve it had become sort of second nature to me. (It's a case of enlightened self interest though, not altruism.)

Some people give back to the Ruby community by creating cool new libraries (or improving the existing ones). Others by taking new programmers under their wings and helping them develop Ruby skills. I don't really have the programming chops to do that, so I found a niche as a community hacker.

Beyond the general value I derive from a better, stronger, friendlier community, there are also some more specific benefits that accrue. Since I hang out on a bunch of .rb mailing lists, I hear about things that are happening. I've been invited to meet with groups in places I travel to. There's also a sort of vicarious sense of accomplishment when I see a group I've been involved with do something really cool.

You were one of the co-founders of the Seattle Ruby Brigade. What was it like when the group first started?

Pat Well, the skit that Aaron Patterson did in the Seattle.rb presentation at RubyConf last year wasn't completely accurate — it is pretty fun, and not too far off though.

When things actually started, it was Ryan Davis, Doug Beaver, and me. We met at a little cafe the first couple of times, then moved to Seattle Pacific University's library. I think that's where Eric came into the picture.

At first we did a lot of talking. We started hacking together on a Ruby mail application, but it went nowhere fast. I remember meetings where we talked about the ruby debugger, ten cool libraries, and this cool window manager someone was writing in Ruby. There's just an incredible amount of Ruby talent in Seattle.

Having RubyConf in Seattle as we were starting really gave us a jump start. That's where I met Phil Thomson and talked with him about getting the pdx.rb rolling. It also got me thinking about doing something locally. We tried it with 'Ruby in the Rainforest', a local mini-conference we held over on the Olympic Peninsula. I think we had 6 people there, but it was fun.

What did the Seattle.rb do right and what made it so successful?

Pat I think some of the biggest things that the Seattle.rb does right are: they meet regularly, even if not everyone can make it; they combine hacking, learning, and socializing; and they advertise heavily. When you combine these three factors, you find that the ruby community knows that there will be a meeting and that it's going to be a good time.

Of course, it doesn't hurt that they've had some brilliant hackers there over the years. I mean, who wouldn't want to go and hear Evan talk about rubinius, hack with Eric and Ryan, or just hang out with some of the smartest Rubyists anywhere?

What other Ruby groups have you helped start and how have they done?

Pat I've been involved to some degree or another with a lot of groups. Most of them have taken off and flourished and I no longer have any role with them other than hanging out on their mailing list and wishing I could make a meeting. The group I'm most involved with these days is the URUG (Utah Ruby Users Group). This is really an umbrella group and includes the Logan.rb, the Layton.rb, the SLC.rb, and the UtahValley.rb. I used to attend the UtahValley.rb most months. These days, I'm limited to making the occasional hacking lunch. :(

What drew you to Ruby in the first place?

Pat I was consulting at Fidelity and had just finished a fairly large Y2K monitoring app in Perl. Looking at my next opportunities, I decided I needed to become a better Perl programmer by learning another language. Since I got the idea from 'The Pragmatic Programmer', and since Dave and Andy had just released the Pick-axe, it seemed like a good fit.

For a while it seemed to work. I tested my Perl code better, it even became more legible ... Pretty soon though, I found myself not enjoying Perl anymore. I wanted to write things in Ruby instead. I'm sure the other guys I worked with got tired of hearing me praise Ruby, but it just fit the way I think. It seems like it fits the way a lot of other folks think too, Matz did a pretty good job with it.

Do you think of Ruby as a full fledged development language or do you think of it mainly as a web and scripting language?

Pat I actually do very little web programming, so I think of it more as a programming and scripting language that some people use for web stuff.

For me, things kind of sit on a continuum. At one end, there's BASH. I do a lot of quick, one-off stuff here, but I don't like to write anything over 20 or 30 lines in it since it's harder to maintain. On the other end is some kind of fast, compiled language like C. I only go here when I really need the speed, otherwise the pain just isn't worth it. Ruby sits in the middle and is where I'd like to do most of my work.

I know you actively learn other languages than Ruby. What languages currently have your interest and what do you like about them? Do you see other languages competing with Ruby or do you learn them to fill other gaps?

Pat I try to look at languages to learn ideas or approaches that I can use, but I'm also on the lookout for several things:

  • A fast, compiled language that's near-C speed but friendlier and safer. I'm currently looking at OCaml as a possibility here. Haskell is interesting too because it sort of sits on the border between Ruby and C in terms of how I could see myself using it.
  • Another language that fits my mind even better than Ruby. I don't know that I'll find one, but you never know — so far, I've found several that don't.
  • A good concurrent language. There are a bunch of options out there: Erlang, Reia, Scala, maybe OCaml ... I need to learn more and do more with them to figure out where I should be looking hardest.

How do you think working as a system administrator changes your perspective on programming languages?

Pat I don't know that it changes my perspective on languages as much as my goals in programming. I'm a lot more focused on little tools that help me do my job. (Which is probably one reason I don't do much web programming.)

On the other hand, I spend a lot more time with programmers than a lot of other sys admins. I think my exposure to the Ruby community has really helped me there. I understand it when the developers dive into 'Agile mode' and start talking about sprints, unit tests, coverage, and what not. I think that makes me a better infrastructure guy.

It's also kind of fun to work on 'real' programming more as a hobby than a job. If I don't grok something right away, I don't mind filing it away to come back to later since I don't have a job or a deadline looming. I guess this could become a handicap at times too, since I don't always have an incentive to buckle down and figure something out.

How did you get started in technical writing?

Pat I used to do a lot of training, and the writing is just sort of an outgrowth of that. The first article I wrote was about functions and aliases in shell scripting. I just got tired of people asking me about it, so I wrote up an article. I was floored to find that I enjoyed writing it (I used to hate writing in school). And things sort of took off from there.

After writing some stuff for free, I landed some more 'professional' writing gigs. I ended up writing some magazine articles, a few tutorials, even a book (no, it didn't sell very well). Today, I'm pretty happy writing for my blog. Who knows where writing will take me next month or next year though.


Posted by gnupate 2 comments