NYT Graphics and D3 with Mike Bostock and Shan Carter
Enrico BertiniAnd we have two fantastic guests today from the New York Times, Mike Bostock and Shan Carter. Hi, guys. How are you?
Mike BostockHey, it's good to be here.
Shan CarterHi, it's great to be here.
Moritz StefanerHey, Mike. Hey, Shan.
Enrico BertiniYeah. I let you guys introduce yourself. Mike, you want to start?
The New York Times Graphic Design Team AI generated chapter summary:
Shan Carter and Mike Bostock are graphics editors at the New York Times. Primarily they do interactive graphics for the website. Most of Shan's journalism training has been on the job.
Mike BostockSure. I'm Mike Bostock. I'm a graphics editor at the New York Times. What else do you want to know? I created this D3 visualization toolkit.
Shan CarterI'm Shan Carter. I'm also a graphics editor at the New York Times. I've been here since about 2006, and before that I worked at several other papers in California and basically worked in newspapers ever since college.
Moritz StefanerSo, Shan, you have mostly a journalism background, or do you have like, also design or computer science in the mix?
Shan CarterYeah, my mom was a graphic designer, so growing up, I was kind of exposed to that. And then in school I studied economics and worked at our college paper. But most of my journalism training has kind of been on the job training. I don't have a degree in it or anything.
Enrico BertiniSo you guys are mostly. So you are the people at New York Times who are mostly dealing with the interactive part, right? So you're mostly developing the visualizations that go on the website, right? Is that correct?
Mike BostockThat's right. Primarily we do interactive graphics. I mean, Shan and I are both based in San Francisco, whereas the rest of the department is obviously in Manhattan. And we tend to focus more on the interactives for the website.
How Does a Graphics Team Work? AI generated chapter summary:
The graphics department is close to 30 people, and we do everything from, like, basically almost every map or chart or explanatory diagram that you see in the printed paper. We work on a variety of different graphics, like on different timescales. Most of the projects start with some sort of, uh, either a news event or an article.
Moritz StefanerCan you tell us a bit more about the team? I mean, not everybody might be familiar with how huge and unique the Times graphics team is.
Shan CarterYeah, the graphics department is close to 30 people, and we do everything from, like, basically almost every map or chart or explanatory diagram that you see in the printed paper. And online is our responsibility. And I'd say we split about maybe a third of the people do kind of online mostly. About a third of the people do half and half. And about a third of the people do mostly just print work.
Moritz StefanerInteresting. Yeah. And what's roughly, I mean, quantity is not the primary measure, of course, but roughly. So people have an idea. Like, how many graphics do you produce? Like a day or a week or.
Mike BostockWe work on a variety of different graphics, like on different timescales. You know, we'll work on some that are daily graphics, some that might take a week or two weeks, others that might take a month. I think for me personally, I'm most happy working on a graphic that takes about two weeks, maybe three weeks at most, because it's enough time to really do something interesting, but it's not so much time that you get sick of it and want to work on something else. So we work on a mix of those things. I mean, I think by nature of Shan and I working out of San Francisco, we tend to do a few more longer-term graphics than the rest of the department, by which I mean that sort of two to four week cycle, but we do a mix of things.
Enrico BertiniAnd how does it work? Are you free to come up with your own ideas, or do you get some instructions from New York and then work on ideas that they suggest? How does a new project start?
Shan CarterYeah, I'd say, like, most of the projects kind of start with some sort of, uh, either a news event or an article that a reporter's working on. Um, but from that, I. It's usually kind of like more of a prompt than an assignment. And so it's kind of like, here's this story we're working on. Um, we think it has a great opportunity for some sort of, you know, statistical analysis or map or diagram. And then from there, our department kind of like, does independent research, reporting and writing, and, you know, builds the graphic up from scratch. Not to say that there aren't ideas that we, that, like, any one person could come up with and just say, hey, I think this would be an interesting thing, you know, for the paper, and we can easily pitch stuff, but most of the things are kind of in response to, like, the news of the day or upcoming news events.
Moritz StefanerSo usually there's also an article planned or in the works, and you sort of dock onto that and take the same theme and work with that.
Shan CarterYeah, exactly. Like, for instance, elections is pretty free form. We just know that the elections are happening. So we brainstorm a lot of ideas ourselves.
Moritz StefanerThey are sort of predictable, like, schedule wise as well.
Mike BostockRight. When you think ahead of time, you do some preparation for it, whereas obviously, for breaking news, you have no idea it's coming.
Moritz StefanerExactly.
Enrico BertiniSo normally, part of your work includes also finding the data that you need to create a visualization out of it. Or normally, you get some data already from the persons who have.
Mike BostockIt's a mix of both. It depends on the topic. I mean, sometimes you have a researcher that's doing research in this particular field, and they have a data set, and they think that it would make an interesting graphic, and so they're providing the data to us for us to work on, whereas in other cases, we might be doing a visualization of data that's publicly accessible, or it's just simply historical data, like the swing states graphic. I mean, there's polling data that you can look up or census data, things like that.
Shan CarterYeah, occasionally we will collect data ourselves, like send someone down to New Orleans to visit every house on a block and gather some sort of sample of how people are doing reconstruction or something like that. Much more of the data is like stuff we're finding from researchers or, like Mike said, from publicly accessible sources or from companies that collect data, like Netflix, for instance.
Enrico BertiniOkay, sure.
Shan CarterBut it is our. Like, the people in our department are generally, more often than not, the ones, you know, getting that data or finding the data sources. And I'd say that's almost half the job. It's like cleaning the data. Cleaning it up.
Mike BostockYeah. Even when data is provided to us, there's usually a ton of work to actually get it into a usable format.
Enrico BertiniSure, of course. I can imagine that. Yeah, sure.
How Much Work Goes Into Getting and Checking the Data? AI generated chapter summary:
There's a lot of work that goes into getting the data for a particular graphic. Is there also like fact checking for the visualizations? Ultimately, it's our own responsibility to make sure that the data is correct.
Mike BostockLike, I had an interesting experience on this map that we did recently of illegal immigration across the US-Mexico border. The story was about the Border Patrol agents that had increased in staffing, and we had data that was the number of agents per Border Patrol sector. So in order to create a map of this, we needed to know what the boundaries were for each of the Border Patrol sectors. But just looking that up was extremely difficult, and the contacts we were working with didn't provide us with the data. So I ended up doing some searching. I found this PDF file that had the map embedded in it, and I knew that D3 supported the projection that was required to create this map. So I spent a couple hours trying to figure out exactly what the projection parameters were. I eventually discovered that this map had been distorted, like stretched vertically by 12%, which is why none of my projections lined up. But then eventually I found, through sort of a lucky Google query, a server on a .mil domain, like an ArcGIS server, that was returning the cartographic boundaries for each patrol sector in some ESRI format. It had this totally inscrutable query interface. Anyway, I eventually got it to spit out a separate boundary file for each of these sectors, and I was then able to write a custom script to turn those into standard GeoJSON format and then finally make the map out of it. Sometimes there's a lot of work that just goes into getting the data for a particular graphic.
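The last step Mike describes, turning per-sector boundary files into GeoJSON, could look roughly like the sketch below. It is only an illustration under stated assumptions: the conversation does not say which ESRI format the server returned, so the code assumes Esri JSON polygon features with `attributes` and `geometry.rings`, and the `SECTOR` attribute and the sample square are hypothetical. It treats each feature as a single polygon and does not handle multipart geometries or reprojection.

```python
import json


def esri_polygon_to_geojson(feature):
    """Convert one Esri JSON polygon feature into a GeoJSON Feature.

    Assumes a single polygon per feature (first ring is the exterior,
    any further rings are holes). Multipart polygons are not handled.
    """
    rings = feature["geometry"]["rings"]
    # Esri JSON winds exterior rings clockwise, while RFC 7946 GeoJSON
    # recommends counterclockwise exteriors, so every ring is reversed.
    coordinates = [list(reversed(ring)) for ring in rings]
    return {
        "type": "Feature",
        "properties": dict(feature.get("attributes", {})),
        "geometry": {"type": "Polygon", "coordinates": coordinates},
    }


def merge_sectors(esri_features):
    """Combine separately downloaded sector features into one collection."""
    return {
        "type": "FeatureCollection",
        "features": [esri_polygon_to_geojson(f) for f in esri_features],
    }


# Hypothetical stand-in for one downloaded sector boundary (a unit square).
sample_sector = {
    "attributes": {"SECTOR": "Example"},
    "geometry": {"rings": [[[0, 0], [0, 1], [1, 1], [1, 0], [0, 0]]]},
}

collection = merge_sectors([sample_sector])
print(json.dumps(collection, indent=2))
```

Keeping the conversion in a script rather than editing coordinates by hand means the same step can be rerun on every sector file.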
Enrico BertiniYeah. Yeah. That's the reason why I was asking that. It looks like that's almost always the case. Right. So people working in visualization have to spend a lot of time dealing with data.
Shan CarterYeah, for sure.
Moritz StefanerHow do you deal with, like once the visualization is done with QA, is there also like fact checking? I mean, I know you have quite thorough fact checking for the articles. Do you also have that for the visualizations? So will people click through and like, you know, look at some of the tooltips and look up the numbers if they are correct? Stuff like that?
Shan CarterYeah, internally we'll do that. We don't do it as much. It's kind of like spot checking, but making sure that all your programming is correct. If you have confidence in your original data set, you know, like you're not making an error in it. So we do kind of spot check that, like internally it does seem a little bit different than an article where you're rewriting stuff. Or hopefully when you're doing a data visualization, you're not having false presentation of the core data that you have.
Moritz StefanerSure.
Shan CarterBut in terms of the data collection, we also vet our sources and make sure that the data set that we've been given is properly collected and ask them about their, their techniques and collecting it and making sure that there's nothing wrong or misleading in the core data.
Mike BostockUltimately, it's our own responsibility to make sure that the data is correct. We don't rely on the copy desk to find errors in our data. We have to do that ourselves. But one of the things that I like to do, to sort of preserve my sanity, is to have a fairly well automated process for going from the raw data from the primary source to the data that actually gets incorporated in the graphic. That way you can easily inspect that process to make sure that you didn't make a mistake, because the most likely cause of errors is that you hand edited something and made a slight mistake. Like, Sublime Text has this great feature where you can have multiple cursors open, right? So you're editing like a thousand things at the same time. But then you didn't realize that there was one mismatch of the query that you used, like in some off screen cursor. And so you like edit, oh, it.
Moritz StefanerDidn't wrap around or something like that.
Mike BostockSo if I have a Makefile instead that documents, or that automates, this process that goes directly from whatever the raw data is that we downloaded from the primary source, like from the census or whatever was shipped to us from the researcher, then it's much easier to inspect that process for correctness than it is to follow every individual fact in the dataset.
Moritz StefanerYeah. And you can reproduce the graphic once the data changes, stuff like that.
Mike BostockYeah, exactly.
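As a rough illustration of the automated process Mike describes, the sketch below shows the kind of transformation step a Makefile rule might invoke. It is a minimal sketch, not the Times's actual pipeline: the column names (`state`, `county`, `count`), the sample rows, and the choice of a TSV output are all assumptions made for the example.

```python
import csv
import io


def build_graphic_data(raw_csv_text):
    """Aggregate raw rows into the per-state totals a graphic would load.

    Every transformation lives in code, so the path from the raw source
    to the published numbers can be inspected and rerun.
    """
    totals = {}
    for row in csv.DictReader(io.StringIO(raw_csv_text)):
        state = row["state"].strip()
        totals[state] = totals.get(state, 0) + int(row["count"])

    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["state", "count"])
    # Sorted output keeps the result identical from run to run.
    for state in sorted(totals):
        writer.writerow([state, totals[state]])
    return out.getvalue()


# Hypothetical raw data standing in for a file from a primary source.
RAW = (
    "state,county,count\n"
    "Ohio,Franklin,3\n"
    "Ohio,Cuyahoga,4\n"
    "Florida,Miami-Dade,5\n"
)

tsv = build_graphic_data(RAW)
print(tsv)
```

Because the function has no hidden state, rerunning it when the source data changes reproduces the graphic's input without any hand editing.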
Moritz StefanerYeah. There was a great question from Twitter which really fits that theme, from Catherine Mulbrandon. She does Visualizing Economics, and she asks: when posting a graphic online, should you also post the dataset used in it? Why or why not? I think that's a really intriguing question. Do you think the New York Times should also make the raw data available that was used for the graphic?
Should The New York Times Post the Data Used in a Graphic? AI generated chapter summary:
When posting a graphic online, should you also post the dataset used in it? Why or why not? Do you think the New York Times should also make the raw data available that was used for the graphic?
Shan CarterI know that some, I think the WNYC folks, like John Keefe's team, has a little link at the bottom so you can sort of download the dataset.
Moritz StefanerOkay.
Shan CarterWhich I think it's a cool idea. We've never really discussed it or had a lot of demand for that. My thinking is that if that is useful to you, it's fairly easy to extract it from the source of the page. And in some cases, we don't actually have the rights to redistribute the source data in that way. Like if we've purchased a dataset from AP or something, or this researcher has agreed to let us visualize it, but not necessarily anybody else. And then, I know, Mike, we kind of point, try to point people towards where we got the data, the original sources, and that's usually like a more complete data set as well.
Mike BostockI mean, my personal philosophy is that we shouldn't be responsible for republishing data because generally there's a whole bunch of transformation and aggregation that goes into the data that we use for the final graphic that makes it different than the data that is published by the primary source. And so if you want to reuse that data set, then there may be certain assumptions that are made in the original data set or that are a side effect of this transformation process that may not be clear to you. So generally it's safer to go to the primary source and make sure that you understand the primary data set if you're going to make a graphic on it.
Moritz StefanerYeah. Probably would need some extra documentation as well, like how the.
Mike BostockYeah, and so it's very much a responsibility to make it clear where the data comes from. But I think we do that by linking to the primary sources rather than being the middleman and republishing the data.
Moritz StefanerYeah, but you know what I like about that thought is that it sort of says, well, already the data transformation is sort of a journalistic act. Maybe that could also be of value or that.
Mike BostockRight. So maybe we should publish.
Moritz StefanerPeople might want to use that output. Maybe. Yeah. I found, who knows but it's interesting that you say you didn't have much request for that, so people didn't really come to you until, I don't know, the swing states graphic or so could I use the same data set and do you have that or. That's not a common request.
Shan CarterYeah, I guess it's not something we've really figured out internally. And I know that I have seen some people just take those data sets and revisualize them as kind of experiments, and it's just cool to see. In some cases, like I said, I'm not sure that we have the rights to kind of like make that public for everybody. And it is an extra level of work on top. I think it's, like, admirable, I think it's cool, but I haven't seen a ton of demand personally for it, so we haven't taken the extra step to do that each time.
Mike BostockYeah, I mean, I think the thing I would prefer to do is sort of document our process, like in the form of a tutorial or an example. Like, for example, I made this "Let's Make a Map" tutorial that shows how we use data from Natural Earth to make maps, or how to take data from the US National Atlas. And so that way you're sort of describing a general process for getting access to data rather than just providing access to the one specific data set that we used for that graphic.
Moritz StefanerYeah, and I mean, Shan, you're right in the end. I mean, inspect source and you're good to go anyways if you're technically inclined. And.
Mike BostockYeah, yeah.
Shan CarterMost of our, most of our data sets we load as TSV files.
Moritz StefanerShouldn't be too hard to figure out how to. Actually.
Shan CarterI'm not encouraging. No, but I mean, it is with publicly accessible data. I think that's great that people could use that to make their own stuff. I love seeing people kind of remix our graphics. I think it's really cool.
Moritz StefanerYeah. About the style of work. I mean, you started off with a few really nice, very deep and exploratory visualizations. Like, for instance, the 512 Paths to the White House, which showed a totally new way of thinking about how the course of statewide elections could influence the end result or what options certain outcomes leave, like a decision tree type approach, or the swing states. Like a long. How would you call that type of diagram? Is it like a stack stream thing?
Mike BostockDiagram. But it's not really flow. I mean, it's really just a line chart. I guess it's just.
Moritz StefanerBut it also has different widths.
Mike BostockRight.
Moritz StefanerSo, yeah, but it was very interesting to see how different swing states changed their voting results over the years. And it's very deep data and very exploratory. And I was really excited about this whole, you know, I love this exploratory approaches anyway, so I was excited to see that in the times. And now a few of the newer graphics I found were a bit more handmade and a bit more, let's say, stylized and to the point, like there was a map on Asian countries, how they perform GDP wise and how they change, or the map of the Oscar contenders, like how all these different actors play together. So is that a general trend? Are you now, at this point in time, looking more on a few more, let's say, more handcrafted graphics, or will you go back to deep data as well? Or what's your feeling?
Mike BostockI think it's simply harder to do a good exploratory graphic, and that is because an exploratory graphic, well, it sort of implies that you're doing exploration, which means that you're doing some amount of work to extract the insight from that graphic. Whereas if you have an explanatory or expository graphic, you're sort of presenting upfront, like, here's what the conclusion is, or here's what the interesting insight is from this data. And I think even in the case of the graphics that you're describing as exploratory, our goal is to give an overview that presents something interesting, like with the 512 paths. Just by looking at the beginning, you can see the importance of winning Florida and Ohio compared to some of the states with fewer electoral votes. So there's like a conclusion that you can see even if you don't click on it, even if you don't just notice that it's an interactive graphic. But the reason why exploratory is hard is because trying to find that balance where you have some initial insight that you want to show in the overview, but there's also sort of a really rich data set that facilitates further insight. If you play with it, I mean, it's just something hard. Not every data set has that level of depth to it. So sometimes you just want to show those initial insights as quickly as possible and not make people work for it. So I think we would always love to do more exploratory data sets. It's just a question of finding the right opportunity to make them.
Shan CarterI think the elections are kind of a unique opportunity for those kind of exploratory data graphics. One, because people are kind of inherently pretty interested in that data before they even see your graphic. So it's something that the whole country is interested in up front, and also it's something that we've done several years in the past. So we have a lot of accumulated knowledge, ideas and insights about what works and what doesn't. So it's a lot easier for us to kind of come up with new, new ideas and kind of strike that balance that Mike is talking about because we're so familiar with the dataset and we've done it, you know, year after or every two years, basically.
Mike BostockYeah, like with the 512. Whereas, like, you know, people have their own hunches as to how these particular states are going to go, either because they live there or they have relatives or friends that live there. You know, they have their own particular context that they want to incorporate into that visualization, and so that encourages them to explore it.
Moritz StefanerYeah.
Do You Have Any Analytics for Your Graphics? AI generated chapter summary:
Do you have an idea of how much these more exploratory, interactive visualizations are used by your readers? Are they actually really engaging with this kind of more interactive visualization? Maybe in the future, we'll be setting up more custom analytics for our graphics.
Enrico BertiniSo do you guys have any, do you guys have an idea of how much these more exploratory, interactive visualizations are used by your readers? Are they actually really engaging with this kind of more interactive visualizations? Do you have any, do you ever try to see, I don't know, maybe logging what they do or even having anecdotal evidence of what they do?
Mike BostockYeah, that's a great question. It's something that I have been wanting to work on for a while, but I haven't really gotten around to it yet. I mean, I would say there are a lot of challenges in terms of doing analytics on our graphics because every graphic is different. Right? Some graphics aren't exploratory. So measuring how much time people spend exploring a graphic and then trying to compare two different graphics, one of which is exploratory and one of which isn't, you know, it's comparing apples to oranges. On the other hand, there are lots of things that we could be drawing from data that we're not really collecting right now. So that is one of the areas where I've been thinking of doing some more infrastructure work, because I do have some background in that. For example, the Cube project that I did for Square was really about doing event logging and analysis of those events. And so maybe in the future, we'll be setting up sort of more custom analytics for our graphics so that we can understand how people interact with them and try to quantify some of their success. Yeah, I think.
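To make the idea of custom analytics a little more concrete, here is a minimal sketch of one measurement such event logging could support: the share of sessions in which a reader used any interactive control at all. Everything in it is hypothetical, including the event fields (`session`, `type`) and the event names; it does not use or imitate Cube's API.

```python
from collections import defaultdict


def interaction_rate(events, interactive_types):
    """Return the share of sessions with at least one interactive event."""
    types_by_session = defaultdict(set)
    for event in events:
        types_by_session[event["session"]].add(event["type"])
    if not types_by_session:
        return 0.0
    engaged = sum(
        1 for types in types_by_session.values() if types & interactive_types
    )
    return engaged / len(types_by_session)


# Hypothetical event log: four sessions, two of which touched a control.
events = [
    {"session": "a", "type": "view"},
    {"session": "a", "type": "button_click"},
    {"session": "b", "type": "view"},
    {"session": "c", "type": "view"},
    {"session": "c", "type": "slider_drag"},
    {"session": "d", "type": "view"},
]

rate = interaction_rate(events, {"button_click", "slider_drag"})
print(f"{rate:.0%} of sessions used an interactive control")
```

A per-graphic rate like this avoids comparing time spent across very different graphics, which is the apples-to-oranges problem Mike raises.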
How Much Feedback Do You Get From Your Readers? AI generated chapter summary:
When we publish a graphic, we tend to see some sort of reaction, whether it's positive or negative. The kind of feedback that I generally really spend time looking for is if something's hard to use in terms of the UI. Our goal is to educate people on something, not just impress them.
Enrico BertiniWhich introduces actually another more generic question that I have. So how much feedback do you get from, from your readers? I mean, do you know how people interpret the stuff you do, do you know how able they are to read the stuff that you produce? So do you get any kind of feedback or is just you. You put it there and then you. You hope that it's gonna be the best? I mean.
Shan CarterWe do. I mean, I feel like in this day and age, it's easy to see feedback on stuff on Twitter or like, oh, yeah, through comments or whatnot. And so I think pretty quickly, when we publish a graphic, we tend to see, like, some sort of reaction, whether it's positive or negative. We do have kind of feedback. Sometimes people will actually email us. We have a small feedback button on each graphic. And some graphics actually are sort of like comment based, where people are leaving comments. I'm sure some of the graphics are kind of complicated, and people don't spend the time to engage with them or find it too hard, too difficult to understand or something, but we don't get a lot of negative feedback along those lines. And maybe those people are out there and they're just not taking the time to feedback. But the kind of feedback that I generally really spend time looking for is if something's hard to use in terms of the UI. I feel like time spent learning how to use a graphic is kind of time wasted, whereas time learning, time spent understanding a graphic. We're using these complicated chart forms because they allow a more complex understanding of the data set. And so I feel like that's worthwhile time. So if people are confused as to how to use a graphic, or the buttons don't make sense to them, or there's too many buttons, or they didn't understand that the slider did something, those are the sorts of reactions that I'm really keen and really look for. But the kind of, like, I don't know, we don't get a ton of feedback in terms of what's too complicated to see. I mean, we keep that in mind as we're developing the graphic to try to make it understandable to the highest number of people, but also to have the level of insight that we're looking to show with the data.
Moritz StefanerYeah.
Mike BostockOne bit of feedback that I think is very interesting to observe is whether or not people discover certain interactions or certain modes of interaction. So, like, with the 512 paths graphic, for example, we got a lot of people that said they liked it, but then they either asked for something or they made some sort of comment that made it clear that they didn't realize that it was interactive. Which was interesting, and I think that might have been somewhat an artifact of the design. Right. Like you have this big data graphic in the middle of the page, and then we had these buttons across the top that were white with a little gray outline against the white background. So they weren't very visually salient. And people are trained to sort of immediately leap past all the ads and other distracting content to, like, the big data graphic in the center. And so maybe those people then didn't discover the other interactive components around it. And that's not something that they tell you explicitly, because they don't even realize it themselves. Right. It's just sort of something that you have to infer by hearing how they talk about the graphic.
Moritz StefanerYeah. And I think that's a very good observation, that often the more specifically people tweet about it, the more you understand that they actually went deep and found, like, all the individual insights. And if they just write, wow, awesome new thing, you're never really sure how long they looked.
Shan CarterYeah, exactly. The best feedback is when they're like, when they respond to some insight rather than saying, oh, wow. That's ultimately our goal is to educate people on something, not just impress them with some technical features.
Mike BostockSometimes I wonder when I tweet something and then I see that it's retweeted immediately and it's like, well, how could you? It's taken at least a few minutes to understand it, but it's like they're already retweeting it.
Moritz StefanerYeah. It means people trust you.
Mike BostockYeah.
Moritz StefanerBut it's very interesting because, I mean, you do try to, I think, in each new graphic, find a new sort of visual form or try out stuff. And it's really interesting which type of people respond to what or how broadly these things are understood or not. It's hard to find out.
Mike BostockI think that, yeah, I mean, you're touching on a really interesting point, which is trying to come up with language that is familiar but also sort of expanding our vocabulary so that we can convey these things in the most effective way. Like, one of the things that Jeff Heer, my advisor at Stanford, and I talked about was, like, what is this grammar of interactive graphics? It's like we have a grammar of graphics that describes static graphics. We tend to be fairly familiar with sort of the standard chart typology of bar charts and pie charts and area charts and things like that. But I don't know if we quite have the same sort of established vocabulary of modes of interaction. Right. Like we have the ability to do brushing or panning and zooming. And there are some standard things that we have. But at the same time, it seems like with a lot of these graphics, you kind of want to come up with something new that's tailored specifically to what it is that you're trying to show. And then the challenge is, if you make it custom to this graphic, how do you still make it recognizable at the same time so that it's intuitive?
Enrico BertiniSure.
Shan CarterSure.
Enrico BertiniI think, yeah. I think you might be familiar with some of the research that has been done in the past in this area, but I agree with you. There's not really kind of a grammar that formally defines what kind of interactions you can have. That's definitely really interesting, but probably more complicated. Much more complicated.
Moritz StefanerYeah. But also visual forms. And I really like how you also try and expand the vocabulary in this area.
Mike BostockYeah. So it's like we have this trade off between, do I want to use the native button, which everybody knows what it looks like but is kind of maybe a little bit ugly or doesn't sort of fit in with the rest of our flow, you know? Or like, we did the NFL draft last week, and there was a debate that we had of the slider. It's like, are we going to use the native slider, or are we going to write a custom slider that looks better and is sort of more integrated with the design?
Enrico BertiniYeah, but this is something I wanted to ask you guys. So how do you decide where is the boundaries between something that is familiar but it's not really working well for your purposes or something that is a little bit less familiar, but it's just perfect. So how do you set the boundaries between these two things?
Shan CarterI think, like, when we're doing something kind of weird or new, our goal is not to create something novel to catch people's attention. Generally, it's more that we're trying to convey, as simply as we can, the essence of the core truth that we're trying to convey in the graphic. So with that 512 paths, for instance, that kind of novel visual form came out of my desire to show you all the paths available. Right. So it's kind of in response to some question or some idea that we're trying to convey. And so I think if that's your goal, then, I mean, everybody would enjoy something that is easier to understand, or conveys more information in a smaller amount of time, or is just better at conveying something more complicated. So that's our goal. Our goal is not necessarily primarily to create something that's new or different. And so I think aiming your sights at that helps you kind of orient in that kind of weird space where you're doing something nobody has seen before.
The Design Process of the Presidential Graphic AI generated chapter summary:
Did you start with something fairly standard and then develop it into this very unique representation? The first version was remarkably similar to the final version that we published. But there was a critical thing missing.
Moritz StefanerAnd how do you like, what would be a typical workflow? Maybe let's stick with the 512 paths. Did you start with something fairly standard and then develop more into this very unique representation, or did you have the tree idea from the beginning and then just, let's say, tweak the details? How did that come about?
Shan CarterYeah, I kind of, like, we started just kind of building it and the early versions of it.
Moritz StefanerOr would you first, like, scribble?
Shan CarterYeah, this one, there was no, like, papers. Well, there, Mike, you did some paper sketches. That's farther along.
Mike BostockI mean, I think generally we start prototyping with code because we're dealing with data, and you don't know what the graphic is going to look like unless you're using real data. And to do a hand sketch with real data is extraordinarily tedious, which sort of negates the benefit of doing it with pencil and paper in the first place.
Moritz StefanerUnless you're Stephanie Posavec, then you can pull it off, then you can.
Mike BostockMaybe. But generally, I like we start by doing these very minimalist prototypes where we sort of try to cut as many features as possible to just get the raw data graphic on the screen to see what it looks like. And then we can use that to explore the design space fairly quickly and do a whole bunch of iterations and try different branches. And then once we sort of get a sense of what parts of it are working well and what parts need to be improved, you know, we start doing sort of less drastic iterations. We're converging towards what our final design is, and then we can start to add all of those, you know, extra bells or not really bells and whistles, but, like, the elements that are sort of more tedious to implement but that are critical for user understanding. You know, things like abstract.
Moritz StefanerIt's a lot like a sculpting approach that you would make the core or the core features first and then progress towards the beginning.
Mike BostockSo, like, for the 512 paths graphic, I mean, Shan gave this talk at the Visualized conference in New York City, and you can see a whole bunch of iterations that he went through. But the funny thing for me, looking back at all of the hundreds of versions that that graphic went through, is that the first version ended up being remarkably similar to the final version that we published, even though we went through a whole bunch of different variations in between. But there was a critical thing missing. So the first version was, again, a tree layout, where you're sort of looking at the entire space of possible outcomes, but then you're truncating the paths when the decision is made, when the threshold crosses 270 electoral votes or 269 for a tie.
Moritz StefanerBecause it doesn't matter anymore.
Mike BostockExactly. It doesn't matter if Obama wins Florida and Ohio. Nothing else matters beyond that. Right. Because he already has more than 270 electoral votes. But the key thing that was missing.
Moritz StefanerFrom the first, it's 100% game theory, by the way. It's like, you know, as you would do a chess tree or something like this. I love that about it. It shows that whole game development of the election. So it's kind of nice.
Mike BostockYeah, I mean, I think it was successful because it did capture the overview of the entire space. Like, when people were talking about it on television, they would always throw out a couple scenarios. But the problem is you have a couple scenarios, but it's out of 512 or more. And so just picking out a couple examples doesn't give you any sense of what the overall space of possibilities is.
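As a rough illustration of the tree described above, the following sketch enumerates every scenario for a set of swing states and prunes a path as soon as one side reaches 270. The state list, the electoral vote counts, and the "safe" base totals are assumptions added for illustration; the conversation does not give them, and this is not the code behind the published graphic.

```python
from itertools import product

# Illustrative numbers (assumptions, not taken from the conversation):
# electoral votes for nine swing states and each side's "safe" base.
SWING = [("FL", 29), ("OH", 18), ("NC", 15), ("VA", 13), ("WI", 10),
         ("CO", 9), ("IA", 6), ("NV", 6), ("NH", 4)]
BASE = {"D": 237, "R": 191}
WIN = 270


def count_all_outcomes(states):
    """Every state goes one of two ways, so there are 2**n complete paths."""
    return sum(1 for _ in product("DR", repeat=len(states)))


def pruned_leaves(states, totals=None, path=()):
    """Walk the decision tree, stopping as soon as the result is decided."""
    totals = dict(BASE) if totals is None else totals
    for party in "DR":
        if totals[party] >= WIN:
            return [(path, party)]
    if not states:
        # All states assigned and nobody reached 270: with these numbers
        # (538 votes in total) both sides must sit at 269, which is a tie.
        return [(path, "tie")]
    (name, votes), rest = states[0], states[1:]
    leaves = []
    for party in "DR":
        nxt = dict(totals)
        nxt[party] += votes
        leaves += pruned_leaves(rest, nxt, path + ((name, party),))
    return leaves


leaves = pruned_leaves(SWING)
print(count_all_outcomes(SWING))  # 512 complete scenarios
print(leaves[0])                  # the path that ends after FL and OH go to "D"
```

With these assumed numbers, the first leaf is the Florida-and-Ohio case mentioned in the conversation: the path stops after two states because nothing further can change the result.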
Shan CarterAnd actually, the first commit on that was basically just the buttons and then some text readout, because I was trying to wrap my head around exactly that problem because I was frustrated, too. I was like, oh, I've seen four different scenarios, but then I'd see other people's scenarios, and then you quickly lose track of them all. And so I was just kind of like, well, how many are there? Let's just look at it. The first one had no visual representation, and so it just kind of evolved from there.
Moritz StefanerIt's a fun example of how the bird's-eye view is really interesting, but only if you pre-structure it. Right? And, as you say, make sure you just show the relevant stuff and leave out and prune away all the stuff that doesn't matter anymore.
Mike BostockSo, for example, that first tree layout, even though it was very similar to the final graphic that we ended up publishing, the thing that it was missing is that those decision nodes, like the Florida and Ohio example, they didn't have any more visual prominence than the other nodes that, you know, were the leaf nodes, like these very unlikely possibilities. And so the challenge there was to try to keep this simple structure, but to weight the tree so that the likelihood of the paths somehow corresponded to its visual prominence. And that's really what gave it the quick, almost like pre attentive response. Like, you just look at this and you can see, like, the important parts of the tree are more visually prominent than the less important parts. Yeah.
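One way to read "weighting the tree" is to give each node a prominence proportional to the share of complete scenarios that pass through it. The sketch below assumes exactly that, plus a hypothetical mapping to stroke width; the conversation only says that likelihood "somehow corresponded" to visual prominence, so the actual rule used in the graphic may differ.

```python
def path_share(depth, n_states=9):
    """Fraction of all 2**n_states complete scenarios that pass through
    a node reached after `depth` states have been decided."""
    return 2 ** (n_states - depth) / 2 ** n_states


def stroke_width(depth, n_states=9, max_width=32.0, min_width=0.5):
    """Hypothetical encoding: width proportional to the share, with a
    floor so that the rarest paths stay visible."""
    return max(min_width, max_width * path_share(depth, n_states))


print(path_share(2))    # 0.25: a node decided after two states covers a quarter of all scenarios
print(stroke_width(2))  # 8.0
print(stroke_width(9))  # 0.5, the floor for a single complete scenario
```

Under this assumption an early decision node such as the Florida-and-Ohio one is drawn far heavier than a leaf reached only after all nine states are decided.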
Shan CarterAnd I think that was a good example of, like, there's kind of a mantra we have in our department that I think Amanda Cox first said, which is, like, it's easier to kind of make a thousand graphics and pick the best one than to just make the best one from the beginning, you know? So, like, we try to, like, just, like, if we have an idea, we try to try it out as quickly as possible, and then it's much easier to see if it's working or not with the real data. And, like, Mike and I use D3 and kind of the web as our kind of prototyping medium. Amanda and Kevin Quealy and some others use R a lot because they find that really quick for prototyping. It's just kind of whatever lets you make a graphic as quickly as possible, like, the minimum viable graphic as quickly as possible, and not feel like you've invested too much into it so you can throw it away easily, I feel like. So whatever kind of satisfies that in terms of prototyping, that works well for us.
Enrico BertiniSo you normally don't use a lot of paper prototyping, right?
Shan CarterNot really. I mean, we do sketch out for the more explanatory diagrams, but not for data visualization because like Mike said, it's hard to sketch out what data will look like on paper with a pencil.
Mike BostockSo I used paper sketching in the 512 paths, but that was just for the transition animation. So when you change the states, right, like, if you switched from Virginia goes Republican to Virginia goes Democrat, it does this transition where it basically regrows part of the tree that was previously pruned and then prunes a new part of the tree. And so, like, some nodes come up and some nodes come down. And it was pretty complicated and more complicated than I could just sort of think of it in my head. So I ended up writing it down with a couple of scenarios to try to figure out what the appropriate transition was between these two different states of the tree. And so in that case, I was using paper not so much to sketch out the design, but to sort of increase my working memory so that I could design the appropriate transition.
Enrico BertiniSo this is something you do for yourself in order to understand better how it should work, rather than as a way to communicate to some others how you think it would work. Yeah. Okay, good.
Shan CarterAnd we will sometimes take screenshots of our prototypes and bring them into Illustrator if I want to play around with annotation placements or UI elements. So I'll then use Illustrator to kind of like mock up different scenarios with those sorts of things, but generally with the data graphic, it's just hard to do that without the actual data. I find that you get better results by working with the real stuff.
Moritz StefanerI find these hybrid workflows really fascinating. Maybe until two years ago or so, I programmed everything, even the text labels, and it was, I don't know, maybe just some sort of code ethics or something like this. That's only real work if it's programmed. But now, the last few years, I really learned to appreciate all this, how much better it can work if you do few things by hand and sort of. Yeah, get the best of both worlds. And I mean, before the show, we discussed a bit how the, for instance, for this Asia map, like how you had this sort of hybrid workflow between coding stuff, then using these results, but then optimizing a few things by hand. So what's your take on this type of.
Mike BostockWe're really lucky, in this space that we're working in, that we have these fixed data sets. So we have the opportunity to incorporate human tweaking and adjustment to improve the quality of our graphics. I mean, certainly in spaces where the dataset is not defined ahead of time, you don't have the opportunity to, well, maybe if you used Mechanical Turk, but, you know, you can't incorporate human correction. But in our cases, because we know these data sets ahead of time, we can. And so then the challenge becomes sort of, what's the most efficient process? Right? Like, you have a choice and you want to maximize quality per unit of development time. So you have a choice between doing sort of the fully automated approach, which would require you to implement a potentially complex algorithm, potentially like some open research areas, versus doing things by hand. And trying to find which parts of the problem are best solved by the computer and which parts of the problem are best solved by hand is really kind of the interesting challenge to producing the best graphic in the least available time. So in the case of the cartogram you mentioned, you know, we were starting with this continuous area cartogram algorithm, the Gastner-Newman algorithm, and it does a distortion of geography, but it's continuous. So it's just distorting the boundaries of each of the provinces of China or these countries in Asia. And then on top of that, we're overlaying this hexagonal grid to produce the final sort of discrete cartogram. And the challenge was that overlaying that hexagonal grid introduced a lot of error. And so we could sort of throw away the Gastner-Newman algorithm and design a new algorithm from scratch for solving this problem, which would be a whole lot of work.
Or we could sort of take the output of this existing sort of off-the-shelf algorithm and then, just doing a little bit of hand correction, end up with a high quality result that didn't have any apparent error. So that's what we ended up doing. I still think it would be interesting to go back and design a discrete algorithm for that particular problem, because it ended up being quite tedious because I chose very small hexagons. If I had chosen larger hexagons, it would have been a much easier graphic to make. So sort of a classic noob mistake. So Ralph Straumann is this Swiss cartographer who was the one who made this hexagonal cartogram that was part of the inspiration for this graphic. And he helped me by explaining the process that he used to make the graphic. And so when I finally published it, he commented like, oh, your hexagons were really small. And I was like, I didn't know.
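The error introduced by the hexagonal grid can be made concrete with a small check: compare how many hexagons each region ended up with against the number it would get if hexagons were handed out exactly in proportion to its data value. This is a minimal sketch with a hypothetical function name and made-up numbers; the conversation does not describe how the error was actually measured or corrected.

```python
def hexagon_error(target_values, hex_counts):
    """Per region, how many hexagons it has beyond (+) or short of (-)
    its ideal proportional share of the grid."""
    total_value = sum(target_values.values())
    total_hexes = sum(hex_counts.values())
    errors = {}
    for region, value in target_values.items():
        ideal = total_hexes * value / total_value
        errors[region] = hex_counts.get(region, 0) - ideal
    return errors


# Made-up example: region B has three hexagons too many, and those need
# to be reassigned by hand to A (short by two) and C (short by one).
values = {"A": 50, "B": 30, "C": 20}
counts = {"A": 48, "B": 33, "C": 19}
print(hexagon_error(values, counts))
```

A report like this shows where hand correction is needed, and it also hints at why small hexagons are tedious: the finer the grid, the more individual cells have to be moved to bring each region back to its share.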
Making a Cartogram with a Hexagonal Grid AI generated chapter summary:
The challenge was overlaying that hexagonal grid to produce the final discrete cartogram. It ended up being quite tedious because I chose very small hexagons. It's almost all done in D3, although there is one example getting a lot of background noise.
Moritz StefanerHe could have told you before.
Shan CarterRight.
Moritz StefanerThat's interesting. And technically, really on this sort of file format level, how do you, let's say, bounce results between data processing, D3 and Illustrator? Do you, like, create SVGs in D3, open them in Illustrator, save them back, stuff like that?
Mike BostockIt's almost all done in D3, although there is one example... We're getting a lot of background noise. That's Enrico. Okay, there we go. Thanks.
Shan CarterEnrico, is somebody in the car?
The hybrid computer-human approach to graphics AI generated chapter summary:
Another interesting example of this hybrid computer human approach to graphics is the network diagram that we did on the Oscar contenders. Of course, if you have time to build your own custom tools that are really the best to make one graphic on point, I think that's really fantastic. But you also have to be ruthless about only implementing what you actually need.
Mike BostockAnother sort of interesting example of this hybrid computer human approach to graphics is the network diagram that we did on the Oscar contenders. And that started out as just a standard force-directed graph layout, you know, where you just plug in the network and it automatically lays out the entire graph for you. But while that was fairly good, it really was not as good as you could do by hand. And so I was collaborating with Alicia DeSantis on making the print version of that graphic, and she ended up taking the output that we had in SVG and putting it into Illustrator. And then she did a whole bunch of tweaking to that network, and then she showed me the draft for print, and I was like, wow, that looks so much better than the version that I had been making for the web. And so then I had to figure out how to incorporate those improvements back into the version for the web so that it looked better. And so the way that I did that is I ended up implementing sort of an editor interface for the graph that would sort of start with the initial output of the force-directed graph layout algorithm and then allow me to just sort of move things around in the browser and then save it back out to the data file. So in a sense, it was like a very limited custom version of Illustrator that I wrote in the graphic to just move the nodes around, but with.
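The workflow described here, starting from an automatic layout, nudging nodes by hand, and saving the result back to a data file, can be sketched as a merge of two position tables. The function names, node names, and JSON shape below are hypothetical; the conversation does not describe the editor's actual data format.

```python
import json


def apply_overrides(layout, overrides):
    """Start from the computed layout and let hand-placed positions win.
    Overrides for nodes that are not in the layout are ignored."""
    merged = {node: dict(pos) for node, pos in layout.items()}
    for node, pos in overrides.items():
        if node in merged:
            merged[node].update(pos)
    return merged


def save_positions(positions):
    """Serialize positions so the published graphic can load them as data."""
    return json.dumps(positions, indent=2, sort_keys=True)


# Output of an automatic layout (made-up coordinates).
computed = {"actor_a": {"x": 10.0, "y": 40.0}, "actor_b": {"x": 12.0, "y": 41.0}}
# Hand correction: move one node away from its overlapping neighbour.
by_hand = {"actor_b": {"x": 60.0}}

final = apply_overrides(computed, by_hand)
print(save_positions(final))
```

Keeping the overrides separate from the computed layout means the automatic step can be rerun without losing the manual work, which fits the point about building only the minimal tool the graphic needs.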
Moritz StefanerThe constraints implemented, like what can go where or also the line drawing being automated, of course.
Mike BostockRight. So it was very specific to this particular problem, so I could just move around the actors and the sort of connecting splines and the labels and such like that.
Moritz StefanerYeah, that's a luxurious workflow. Of course, if you have time to build your own custom tools that are really the best to make one graphic on point, I think that's really fantastic.
Mike BostockWell, yeah, I mean, that is kind of the challenge is like figuring out when you're building a tool, sort of what is the minimal instantiation of the tool that gives you just what you need. Because if you sort of overgeneralize, like if you overbuild your tools, you can spend a lot of time building this thing that you're really just going to throw away at the end of the graphic. Like there's no value in the tool itself. The value is only in the output of the tool. So sort of having a foundation like tools like D3 that make it faster for you to build these things is sort of what makes it a viable approach. But you also have to be ruthless about only implementing what you actually need for the graphic.
Moritz StefanerYeah, I mean, there is always this danger that you build this Swiss army knife type thing. You can do a lot of cool stuff once you start building tools and frameworks. I mean, we could move on. Talking about D3, I mean, people are surely very curious to hear your perspective on D3 and how the whole thing developed, where you see it going. So I think that could be interesting to discuss. So we did have Jeff on the show maybe, I don't know, half a year ago. So we covered all the pre-D3 stuff with him anyway. So our listeners should be familiar. Yeah, should be familiar with Protovis and, you know, all the precursors and so on. So can you give us a quick run through of the history of D3 and where it is now?
D3 and the visualization field AI generated chapter summary:
Jeff: Can you give us a quick run through of the history of D3 and where it is now? The goal of Protovis was to come up with a simple language that expressed visualizations as graphical marks. D3's challenge is to try to create a language that strikes the right balance between being accessible and efficient.
Mike BostockSure, that takes me back a few years. So let's see, I joined Stanford's PhD program in computer science in 2008. Jeff, let's see. One of the first classes that they make you take is called CS 300. It's a survey class, and basically, like, every professor does one lecture to talk about their area of research. And so it's a great way to get a sense of sort of the entire space of research opportunities in computer science. And Jeff at the time was still finishing his PhD at Berkeley, but he had already gotten offered a job as a professor at Stanford. And so he came to give his CS 300 lecture on information visualization. And it was actually, somewhat embarrassingly, I mean, the first time I really learned about visualization as being a research field. Prior to that, I had an interest in visualization. You know, I had done some stuff at Google, and, you know, I owned every book by Tufte, and it was something that I knew that I was interested in, but I didn't really realize that it was something that I could focus on as part of my PhD studies. So right after he gave his lecture, I was like, this is amazing. Like, this is what I want to do. Like, will you be my advisor? And so he then joined, I guess it was in the winter quarter, and taught the CS 448B data visualization class. And Protovis was my final project for that class. So there are several sort of earlier homework assignments, smaller projects, but then the culmination of the class is like a four or five week project where you implement something and then you write an academic style paper to accompany whatever it is that you made. And so that ended up being the Protovis paper that was accepted to the IEEE InfoVis conference. And let's see.
Moritz StefanerAnd you had the framework implemented as a prototype as well, right?
Mike BostockYeah. The prototype of Protovis, that's sort of a tongue twister, was implemented as part of that class.
Moritz StefanerYeah. Wow. Yeah. That must have been a lot of work, right? Because, I mean, obviously, or did you, like, in the beginning, already know, like, how this thing would work, you just wrote it down or.
Mike BostockBecause writing a framework like that is. No, I mean, I had an idea very early on that I wanted to build this sort of bottom up approach to visualization, which is that I saw a lot of the existing approaches, sort of took more of this top down approach where they tried to sort of at a very high level, describe how it's sort of a language of visualization or grammar of graphics. And what I wanted to do was more sort of build it up from these visual components, because I just find it more approachable or more intuitive to think about the discrete visual elements that comprise the visualization that I'm making, rather than thinking about it more in mathematical terms of these mapping dimensions of data to visual encodings. And so the whole goal of Protovis was to try to come up with a simple language that expressed visualizations as graphical marks. Forgot what I was going to say after that.
Moritz StefanerYeah, but it's really nice for explaining how visualization works, I think. So I still use it when I teach. I still use this cheat sheet from Protovis because it's so nice in the sense that you could say like, okay, here's different types of graphical marks you can use. They can have static properties and dynamic ones, and the dynamic ones might depend on data or on their order in a list. So all these basic things are very nicely laid out there. And I think that also makes the basis for D3's logic of organization.
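The idea of marks with static and dynamic properties can be shown with a very small sketch: a property is either a constant, or a function of the datum and its position in the list. This is not Protovis's API (Protovis is a JavaScript library with its own mark types); the function and property names below are made up for illustration.

```python
def render_marks(data, **props):
    """Produce one mark per datum. A property given as a plain value is
    static; a property given as a function is evaluated per datum and index."""
    marks = []
    for index, datum in enumerate(data):
        mark = {}
        for name, value in props.items():
            mark[name] = value(datum, index) if callable(value) else value
        marks.append(mark)
    return marks


bars = render_marks(
    [1, 2, 3],
    width=20,                      # static: the same for every bar
    height=lambda d, i: d * 10,    # dynamic: depends on the data
    left=lambda d, i: i * 25,      # dynamic: depends on order in the list
)
print(bars)
```

Each bar ends up as a plain dictionary of resolved properties, which is the "direct mapping from your data to these graphical elements" that the conversation goes on to describe.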
Mike BostockI mean, the challenge is to try to just have this very simple representation, which is a direct mapping from your data to these graphical elements, rather than something that's more abstract than that. But at the same time, it can't be so low level that it's extraordinarily tedious to work with, right? I mean, people want to make visualizations quickly, they don't want to spend any more time than they have to making these things. So having a language that strikes the right balance between sort of being accessible, easy to learn, being efficient, which is like fast to create whatever it is that you want to create, and also being expressive, like having the ability to express all of the different possible types of visualization you might want to make, those are the three main dimensions or criteria we were looking at in designing a language in terms of going from Protovis to D3. I think the real thing there that I was trying to accomplish was to make it even more expressive without giving up this efficiency. So the main limitation of Protovis was because it defined this language of graphical marks. Anything that wasn't in that predefined language, you wouldn't be able to make. So some simple examples of that.
Moritz StefanerIt was like a sandbox world, right? So you had, you had circles and rectangles and now deal with it.
Mike BostockAnd so, you know, I loved some aspects of the language, but at the same time, when I wanted to do certain things, like if I wanted, you know, dotted or dashed strokes, you know, or if I wanted a gradient or something or a clipping path or various other things, those were things that the browser was perfectly capable of doing, but Protovis could not support because they weren't built into the language. So my options were to sort of keep incrementally adding features to the Protovis language, or to try to rethink it and somehow still capture the essence of Protovis without simultaneously incorporating the representation. And so that's really how D3 came about, is to try to basically solve a smaller problem, which is this mapping from data now not to a custom graphical language, but to an existing representation, which is the document object model. So how to capture that mapping without needing to also specify the entire graphical language? And the beauty of that is that as new functionality is added to web standards, that functionality is immediately available when you use it in D3. And you don't have to learn sort of a proprietary representation, you just learn the standard representations. And you can take advantage of all of these existing tools in this ecosystem that are all based on web standards, like your element inspector and your JavaScript console that's built into your web browser.
Moritz StefanerSo would you say D3 is more like a toolkit? I always perceived it more like a toolbox of things you can freely combine, and you just see what comes out of the combination, rather than this more framework-like approach of: here's this chart type, and you can customize it.
Mike BostockYeah, it's definitely, I definitely wouldn't call it a framework or a platform or any sort of monolithic thing. One of the ways that we talked about it was this idea of a visualization kernel, which is basically sort of what is the essence of visualization? Like what is the smallest but still central problem that you can solve, that applies to basically every situation that occurs when you have visualization. And so that's where you get this concept of mapping data to a scene graph, to the document object model. Like whenever you have a declarative representation of a graphic, be that SVG or HTML or any other sort of scene graph, you have this problem, which is you have data, right? You have some abstract, hierarchical or tabular data, and you need to somehow transform that. You need to create a scene graph from that data. Or likewise, if you're already showing a graphic and you want to show a different view, or you want to support some sort of interaction or animation, you need to transform that existing scene graph to correspond to whatever your new data is. And so that was really the core problem that we were trying to solve with D3, is to express these transformations of the document object model based on data. And that's really where you get the concept of the data join. And these three sort of enter, update and exit sub selections, they represent the three different possibilities when you're joining data to an existing scene graph. But then on top of that visualization kernel, there are all sorts of higher level primitives or abstractions or components that you want in order to make common types of visualization. And so in that sense, on top of this kernel, you have sort of an unbounded toolbox of different things that you can plug in, but the key is different layouts.
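The three cases of the data join can be sketched without any library: given the elements already on screen and the new data, matched by a key, work out what has to be created (enter), what has to be changed (update), and what has to be removed (exit). This is a plain Python illustration of the concept with hypothetical names, not D3's JavaScript API.

```python
def data_join(existing, data, key):
    """Join new data to existing elements by key.

    existing: dict mapping key -> element currently in the scene graph
    data:     list of new data items
    key:      function returning the key of a data item
    Returns (enter, update, exit):
      enter  - data items with no matching element yet
      update - (key, datum) pairs whose element already exists
      exit   - keys of elements with no matching datum anymore
    """
    by_key = {key(d): d for d in data}
    enter = [d for d in data if key(d) not in existing]
    update = [(k, by_key[k]) for k in existing if k in by_key]
    exit_ = [k for k in existing if k not in by_key]
    return enter, update, exit_


on_screen = {"a": "<circle a>", "b": "<circle b>"}
new_data = [{"id": "b", "r": 5}, {"id": "c", "r": 9}]

enter, update, exit_ = data_join(on_screen, new_data, key=lambda d: d["id"])
print(enter)   # [{'id': 'c', 'r': 9}]
print(update)  # [('b', {'id': 'b', 'r': 5})]
print(exit_)   # ['a']
```

The same three-way split covers the first render (everything enters), an animation between two views (some of each), and teardown (everything exits), which is why it can serve as the small central problem described above.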
Moritz StefanerThere's a huge mapping library by now, also due to Jason Davies' passion, let's say.
Mike BostockJason has done a tremendous amount of work on our D3 geo module, which has just been really exciting. But yeah, I mean, the idea is that you have this sort of unbounded set of components that you can use. And the key point though is that they're modular, they're decoupled. So you have the kernel at one level, which is this very small thing that's just trying to map your data to a scene graph. And then on top of that you have these other components that you can pick and choose. And I think even though, you know, some people talk about D3 having sort of a steep learning curve and having difficulty understanding what the concept is of a data join, once you understand that concept, which admittedly is fairly foreign, but it's kind of what gives it its power, you've sort of mastered everything in D3 that you need to start using it. But beyond that, you can sort of pick up these individual components as you want to increase your expressiveness and start to incorporate other standard visualization algorithms.
Moritz StefanerAnd, I mean, I think the main point is you have so many examples that for anything you want to do, there's already an example that's like 50% of it, and you can just take that and customize it until you're sort of close to what you want to get to. Yeah, I mean, one thing I found tricky, so I've been using D3 quite a bit lately, but one thing I found tricky is to scale it to really complex applications that have lots of different views and different tabs or screens and lots of options. But probably that's more finding the right patterns for yourself to implement all these conditionals and these different.
D3: The Most Popular Visualization Framework AI generated chapter summary:
D3 doesn't presuppose a particular high level design for your application. One thing I found tricky is to scale it to really complex applications. I think we still need to figure that out, like how to use it best.
Mike BostockYeah, I mean, D3 being the sort of low level tool that it is, this visualization kernel, it doesn't presuppose a particular high level design for your application. So if you have a complicated application, you'll need to sort of solve that problem on your own. Like D3 doesn't solve that problem for you. Now there are things that you can use and there are sort of some standard patterns and things for building these larger applications, but I haven't sort of tackled that problem myself.
Moritz StefanerYeah, yeah, but it's tricky. There's also a conceptual decision in the sense of, okay, when do I make it two or three different SVGs and when do I keep it in the same SVG? You know, that's hard to, you know, or these types of decisions, like at what level do I introduce distinctions and so on.
Mike BostockYeah, that can be a challenge.
Moritz StefanerI think we still need to figure that out, like how to use it best. But I think it's safe to say it's the, I think up to now it's been the most successful or most popular, let's say, visualization toolkit framework. I checked sometime in January and I saw it was like on place eight or so on GitHub, like the most starred project. So it's hugely successful.
Mike BostockIt's been pretty amazing to see how many people are using it. And just that wiki on D3 is all publicly editable and so everybody just can add their examples to that gallery.
Moritz StefanerThat's hundreds of examples. It's just amazing.
Shan CarterYeah. I mean, from my standpoint, D3 was the only visualization framework I was ever able to use. Like sort of in production. I kind of played around with even Protovis before and like some of the other ones.
Moritz StefanerSure.
Shan CarterAnd one of the things that always struck me is, like, with a lot of the other ones, you can always tell when you see an example out there of what framework was used to make it because there's always some telltale sign because it's a little like too rigid or whatever. But with D3, a lot of times it's like you'd have to inspect the source and check for the D3 object to see because you weren't sure. There are some telltale signs, but it's so expressive that you can do so much with it.
Moritz StefanerIt brings no visual defaults. I mean, maybe the axes, once you see one of these D3 axes, you might recognize them, but that's it. That's true. We have a few questions from Twitter as well. Anna Schneider. She asks, what do you think of Vega, and will it change how you use or teach D3? So probably we should explain. So Vega is a fairly new development, again from Jeff Heer or his startup company Trifacta, and they do something interesting, but I'm not sure I even understand properly what they're doing. But it's sort of an abstraction on top of D3 or other frameworks of how you can describe charts and their contents.
Vega vs. D3: Will it Change How You Use AI generated chapter summary:
Vega is a strictly declarative language. Unlike D3, it's a higher level abstraction. For me, Vega is the perfect output interface for a lot of exploratory data analysis tools. Enrico: The space is still wide open.
Mike BostockSure, I can take a crack. I mean, for full disclosure, I'm a technical advisor to Trifacta, but I'm not directly involved with this Vega project, although I've been helping Jeff a little bit with the design.
Moritz StefanerSure.
Mike BostockBasically what Vega does is it's a strictly declarative language. So unlike D3, which sort of tries to be as declarative as possible in terms of minimizing control flow, like you don't have a lot of if statements or for loops when you write D3 code, Vega is strictly JSON, so it's only declarative. I guess there's a small exception to that in terms of the interaction handlers, but generally speaking, philosophically, it's only declarative. And then the other thing that's different about Vega is that it is a higher level abstraction. It's higher level than D3. So it's a little bit more like the grammar of graphics in that you have a definition of what your dimensions of data are and then these transformations. And so it's able to do things like provide axes for you automatically rather than you needing to sort of wire those up with scales like you would do with D3. So for me, Vega is really sort of the perfect, or at least a much improved, output interface for a lot of exploratory data analysis tools. So for example, if you're using R or if you're using pandas, people have already written interfaces on top of Vega. There's this project called clickme for R and something called Vincent for pandas that basically let you generate Vega and D3 visualizations from these exploratory tools. And I think.
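To make the contrast concrete, here is a minimal sketch in Python of what "strictly declarative" can mean: the chart is plain data (a dict that serializes to JSON), and a small interpreter derives the scale domains from it instead of the author wiring them up by hand. The field names (`data`, `scales`, `axes`) and the helpers `resolve_scales` and `apply_scale` are hypothetical and only loosely inspired by the description above; this is not the actual Vega schema.

```python
import json

# A hypothetical, Vega-inspired chart description: pure data, no control flow.
spec = {
    "data": [{"x": 1, "y": 28}, {"x": 2, "y": 55}, {"x": 3, "y": 43}],
    "scales": [
        {"name": "x", "field": "x", "range": [0, 400]},
        {"name": "y", "field": "y", "range": [300, 0]},
    ],
    "axes": [{"scale": "x"}, {"scale": "y"}],
}


def resolve_scales(spec):
    """Derive each scale's domain from the data, as a higher-level tool might."""
    resolved = {}
    for scale in spec["scales"]:
        values = [row[scale["field"]] for row in spec["data"]]
        resolved[scale["name"]] = {
            "domain": [min(values), max(values)],
            "range": scale["range"],
        }
    return resolved


def apply_scale(scale, value):
    """Linearly map a data value from the scale's domain to its pixel range."""
    d0, d1 = scale["domain"]
    r0, r1 = scale["range"]
    t = (value - d0) / (d1 - d0) if d1 != d0 else 0.0
    return r0 + t * (r1 - r0)


scales = resolve_scales(spec)
spec_as_json = json.dumps(spec)  # the whole chart round-trips through JSON
```

Because the specification is only data, a tool written in R or Python can emit it without knowing anything about the renderer, which is the appeal for exploratory pipelines.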
Moritz StefanerSo would you say it's almost like a ggplot replacement or something like that?
Mike BostockI think you could write sort of a ggplot2-to-Vega-to-D3 pipeline if you wanted to as well. But I mean, more generally, getting back to this visualization kernel, it's always been my hope that people would build higher level abstractions on top of D3. That way they can pick a different point along the spectrum between the most expressive but lowest level tool and the most efficient, highest level tool. But then you correspondingly have to give up some of your expressiveness. So you can't be as expressive in Vega as you can in D3, but at the same time your specifications can be smaller and it can provide a lot more automatic functionality for you.
Moritz StefanerYeah, it's very interesting and this idea that, yeah, it's something you can combine with other tools or establish these pipelines. I think that makes it very, very fascinating to think about that because traditionally.
Mike BostockD3 hasn't really been designed for exploratory visualization. It's really for these types of custom visualizations like we do at the New York Times. As you heard earlier, I do use it for exploratory visualization, but for me that only works because, first, I'm very familiar with D3 already, having written it, and second, we cut all of these corners in our iteration, in our early prototypes. So we're not trying to solve the more tedious tasks of visualization just to get a sense of what the raw data graphic looks like. And that's what enables us to use D3 for exploratory visualization, even though on the surface it might seem more appropriate only for custom data graphics.
Moritz StefanerYeah, and I mean, it's not for everybody to customize everything. Some people just want a quick line chart based on some data. And I think Vega bridges exactly that, that sort of gap.
Mike BostockThe other aspect that I like of using D3 for exploratory visualization is as we're doing this iterative design process and we start to converge on our final design, I don't have to throw away whatever it is that I made using the exploratory tool to finally start producing the final graphic. Right. I've got a starting platform for building that final graphic and I can just sort of. It allows me to transition very easily from the exploratory stage of the design process to more the refinement stage when we're actually making the final graphic.
Moritz StefanerYeah, that's interesting. Yeah. So I think this space is still wide open. Yeah, totally. So little update on Enrico. So they kicked him out of the room. I read this on Skype chat now and he's on the street, but we hope we can have him join sometime soon when he finds a Starbucks or something. Just for those of you who might be wondering why he's so quiet, I could also dial him in with phone maybe. We'll see. He will be back. We can have a few more questions. So Lynn Cherny, she asks, are there more D3 books in the works or more tutorials that are especially targeted towards more advanced users? So we've seen a lot of D3 getting started in D3 tutorials, I think. I mean, that makes sense too, because everybody needs to get started, but now there's also need for more advanced stuff. So are there any plans you're aware of or are you writing on something yourself?
D3: More Intermediate Tutorials AI generated chapter summary:
Are there more D3 books in the works or more tutorials that are especially targeted towards more advanced users? I think there's definitely demand, increasing demand now for more of these intermediate or advanced level tutorials.
Mike BostockWell, just on Friday I released a new article that I wrote that explains the internal workings of selections, called How Selections Work.
Moritz StefanerYeah, that was very informative for me. It was like, ah, groups. I had sort of this visceral feel for groups, but I couldn't name it.
Mike BostockThat's great.
Moritz StefanerSo now I can point to.
Mike BostockYeah, so the goal there was to really describe how selections are implemented rather than just how to use selections. And I think it is interesting, sort of as D3 has gotten more established and more people are using it, the demand shifts a little bit from just sort of why should I be using this tool and how do I get started to. I've already decided to use this tool, but how do I really master it and get the most out of it? And so in that case, you know, when you have a doc or an article like how selections work, you know, it takes a little bit more work, it's a longer document, it's more technical. And so if you don't already have a motivation to use D3, it might not resonate with you. But I think for the people that are using D3 already, it provides that really more detailed information that helps them get to the next level with their use of the tool.
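The central idea behind selections, joining data to existing elements, can be sketched outside the browser. The Python below is an illustration under assumptions, not D3's implementation and not taken from the article: `data_join` and its `key` parameter are hypothetical names, and elements are modelled as plain dicts rather than DOM nodes. It partitions a keyed join into the enter, update and exit sets that D3 users work with.

```python
def data_join(elements, data, key):
    """Partition a keyed join into (enter, update, exit).

    elements: existing "nodes", each a dict carrying its bound datum under "datum"
    data:     the new data array
    key:      function mapping a datum to a unique identifier
    """
    nodes_by_key = {key(node["datum"]): node for node in elements}
    enter, update = [], []
    seen = set()
    for datum in data:
        k = key(datum)
        seen.add(k)
        if k in nodes_by_key:
            node = nodes_by_key[k]
            node["datum"] = datum  # rebind the fresh datum to the existing node
            update.append(node)
        else:
            enter.append(datum)  # data with no node yet: needs a new element
    # Nodes whose key no longer appears in the data are candidates for removal.
    exit_ = [node for k, node in nodes_by_key.items() if k not in seen]
    return enter, update, exit_


existing = [{"datum": {"id": "a", "value": 1}}, {"datum": {"id": "b", "value": 2}}]
new_data = [{"id": "b", "value": 20}, {"id": "c", "value": 30}]
enter, update, exit_ = data_join(existing, new_data, key=lambda d: d["id"])
```

Here `enter` holds the datum for "c", `update` holds the node for "b" with its new value, and `exit_` holds the node for "a".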
Moritz StefanerAnd a few advanced patterns could be interesting, like this cookbook type thing where you have recurring problems and you need a smart solution to higher level patterns. Maybe.
Mike BostockYeah, definitely. I think there's definitely demand, increasing demand now for more of these intermediate or advanced level tutorials. Now I try to write tutorials, sort of like whenever I have a chance, in addition to these examples and stuff that I'm releasing. I think eventually, maybe if I have enough of these tutorials, I can somehow string them together and make them into a cohesive, comprehensive book. But just the idea of starting out trying to write a book is a little bit too daunting. I mean, it's a huge amount of work.
Moritz StefanerIt's a huge task. Yeah.
Mike BostockSo I enjoy right now, like sort of the more rapid release cycle of sort of getting a bunch of feedback, like seeing types of questions that people ask either on the mailing list or on stack overflow, or looking at talks that people are giving and trying to identify what are the issues that people are commonly running into. And then write a tutorial that tries to address those issues specifically or just a demo.
Moritz StefanerSo often things have just been cleared up with you just putting out one thing that shows how to use that specific stacking.
Mike BostockIt's so much fun for me to do that because it's not a lot of work to create these demos, but you can just get them out there and then people are so happy to understand how things work.
D3.1 AI generated chapter summary:
D3 is focused on this visualization kernel, the core of it is pretty small. There is an opportunity for more people to write plugins, these additional layouts, or even chart types or behaviors. There are still pockets of resistance to supporting older browsers.
Moritz StefanerSo how is the community? So is it mostly you and Jason Davies? So that's what I gather from following with one eye. Or are there more people involved in the core development of the.
Mike BostockIn terms of core development, I would say that Jason and I are probably like mid nineties in terms of how much of the code that we've written, like 90% or something like that. We do get a variety of other contributions from people in terms of like pull requests to fix bugs or to add small features. Lots of people have edited the wiki to add additional documentation or to add their own examples and things like that. I think part of it is a reflection of D3's design in a way, like because it is focused on this visualization kernel, the core of it is pretty small. It's not this giant framework that needs to expand whenever you want to add a new feature to it. It's a low level thing and most of those features that you want can already be supported. It's just a question of making the right example or teaching you how to do that. But I think there is an opportunity for more people to write plugins, these additional layouts, or even chart types or behaviors or reusable components like the axis and the brush. So there is a D3 plugins repository as well, where some people have been contributing more higher level components for reuse.
Moritz StefanerYeah, makes sense in a way. It's interesting because the model seems very similar to jQuery. So do you follow what's happening with jQuery right now?
Mike BostockThey just have the jQuery.
Moritz StefanerExactly. And they sort of have now these two versions, where 1.9.x is still supporting older browsers, and it seems to have become very tricky right now to manage jQuery just because it has been adopted so widely and, you know, you run into all these issues with backwards compatibility and so on.
Mike BostockYeah, well, I took sort of a hard-nosed stance early on by rejecting, you know, IE8 and below, by basically saying, like, I'm only going to support. This was true of Protovis as well, actually. So like in 2008, only supporting browsers that support web standards, and everyone was like, oh, that's ridiculous at the time. But now of course, you know, you win because IE9 is now, I think, just past the usage of IE8, and there's IE10, and they've got auto-updates set up.
Moritz StefanerBut yeah, it's still an issue. Like, you know, when I work with big corporate clients, they often have.
Mike BostockYeah, definitely. There are still pockets of resistance.
Moritz StefanerWhenever you have a central IT department, you have lots of IE8 still.
Mike BostockBut like, what we do at the New York Times, for example, is that we just use screenshots of our graphics as the fallback for IE8 and below. So it's not like they're not seeing anything. And in fact, the way that we design our graphics is such that the static view of the graphic is supposed to convey all of the important points. Right. So the idea of interaction is that it's just adding sort of an extra layer of depth to the graphic if you want to explore it and extract more insight from it. But you should be able to get the main points of a graphic just by looking at the static representation, which is what makes doing the screenshot a viable approach and being minimal.
Moritz StefanerIt's an identity check anyway. Anyways, like if you don't get it from a picture, probably you don't get it with a drop down as well. Yeah, that's a good point. Yeah, yeah, yeah, it's interesting. Yeah. And like talking about the future, for instance, Sam Leach and a few other people asked, what he asked specifically, where do you think or hope D3 will be in one, two or five years, even from now? Like, do you even think in these.
D3 and the future of SVG AI generated chapter summary:
Where do you think D3 will be in one, two or five years? I hope that the browser vendors continue to make improvements to their rendering engines. There's still a lot of room for improvement in terms of performance and the quality when rendering SVG.
Mike BostockDimensions really so much? Like on a day to day basis? Like, I mean, you never know what technology is going to be like in five years. There is one thing that I would really hope for, though, which is not actually directly related to my development plans for D3, but more broadly to the space of web based data visualization. And that is that I hope that the browser vendors continue to make improvements to their rendering engines. There's still a lot of room for improvement in terms of performance and the quality when rendering SVG. And SVG has this nice property that it is a purely declarative representation of a scene graph. And that means it's possible to take that declarative representation and then push it all the way down to the graphics hardware. And that would greatly improve rendering performance and also even improve the quality wherever you have things like full scene anti-aliasing versus getting these little small seams between adjacent polygons and things like that. And there was actually a project at Nvidia called NV_path_rendering where somebody implemented this. I don't know what the status of that project is, but it was super exciting when I saw him demonstrating a very complex SVG at 200 frames per second, just rotating it around and transforming it. And so I would love to see that happen, to basically improve our ability to use larger data sets and to do richer transitions and animation without having to resort to the low level complexity of WebGL.
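As a small illustration of SVG being a declarative scene graph, the Python sketch below builds a bar chart purely as markup and then reads it back as a tree. It is a hedged example under assumptions: `bar_chart_svg` and its parameters are hypothetical names chosen here, and nothing in it reflects how D3 or any browser engine renders.

```python
import xml.etree.ElementTree as ET


def bar_chart_svg(values, bar_width=20, height=100):
    """Return an SVG document string with one <rect> per value."""
    top = max(values, default=0) or 1  # avoid dividing by zero on empty/all-zero input
    rects = []
    for i, v in enumerate(values):
        h = height * v / top
        rects.append(
            '<rect x="{x}" y="{y:g}" width="{w}" height="{h:g}"/>'.format(
                x=i * bar_width, y=height - h, w=bar_width - 1, h=h
            )
        )
    return (
        '<svg xmlns="http://www.w3.org/2000/svg" width="{w}" height="{h}">'
        "{body}</svg>"
    ).format(w=len(values) * bar_width, h=height, body="".join(rects))


svg = bar_chart_svg([1, 2, 4])

# The markup describes *what* is in the scene, not how to draw it, so any
# consumer (a browser, a GPU-backed renderer, or this parser) can walk the tree.
root = ET.fromstring(svg)
```

The scene is fully described by the element tree, which is the property that would let a renderer hand it to graphics hardware without executing any drawing code from the page.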
Moritz StefanerYeah. How do you see WebGL? Do you think in maybe two or three years every browser will support it and it's going to be like the default high end rendering thing on the web? Or do you think it's going to stay?
Mike BostockI don't know. I mean, initially I was very bearish because I never expected Microsoft to support it, because Microsoft's got this relationship with DirectX and wanting to support that rather than OpenGL, which is the basis of WebGL. But then it turned out, I don't know if they've confirmed this, but there's been some rumor of them supporting WebGL in IE11, which is very surprising to me. I mean, there's still a question of whether Apple will support WebGL on mobile Safari, which they haven't done yet. There's a question of whether they're doing that for security reasons, or whether they want to sort of hamstring your ability to write compelling apps in the web browser versus selling them on the App Store. But that's all conjecture on my part. So, I mean, my personal hope, despite WebGL being amazing, and I love that it exists, I would really hope that we still have improvements to SVG because it's so much easier to use. And so taking better advantage of the graphics hardware while still sticking with this easy to use, declarative graphical representation would be really useful.
Kicking the homeless out of the asylum AI generated chapter summary:
By the way. Enrico, are you back? Yes, I'm back. I was kicked out quite rudely. Where are you now? On the street somewhere. I hope they don't kick me out again.
Moritz StefanerBy the way. Enrico? Enrico, are you back? Yes, I'm back.
Enrico BertiniI was kicked out quite rudely.
Moritz StefanerWhere are you now? On the street somewhere.
Enrico BertiniI actually managed to come back almost to where I was before.
Mike BostockI hope they don't kick me out again.
Enrico BertiniThat was the only solution I found.
Moritz StefanerThey're not going to be so friendly if they find you the second time around.
Mike BostockYeah, yeah, yeah.
Enrico BertiniI shouldn't talk too loud.
Moritz StefanerYeah, sorry, Enrico. Understood.
Enrico BertiniWas quite weird anyway.
Mike BostockYeah.
Moritz StefanerGood to have you back. Yes, sorry, do you have any more questions concerning D3?
Enrico BertiniI have tons of questions. No, I mean, yeah, it's the first time I will have the opportunity to listen to the stories actually.
Mike BostockThat's right, yeah.
D3: Is the Framework Complete? AI generated chapter summary:
Mike: Would you consider the framework to be now complete? And the rest is details. There's still, you know, that entire space of what the higher level components are that you build on top of it. And then also in terms of what I want to do with D3, there's a huge education component.
Enrico BertiniOkay, so did you go through some of the questions from Twitter or.
Moritz StefanerYeah, yeah, we got these pretty much covered. Like the past, present, future. We have it down.
Enrico BertiniOkay. Okay, good. I don't know.
Moritz StefanerSo, Mike, is there still anything. So would you consider the framework to be now complete? And the rest is details. It's not a framework. It's too good. But. Or is there still something where it's like, oh, this part is still really missing? And once I will tackle that.
Mike BostockYeah. In terms of, you know, the core of D3, in terms of, you know, transforming the document object model based on data, I think that's pretty stable. I don't expect there to be too many changes to that in the near future.
Moritz StefanerYeah, you shouldn't change that anymore.
Mike BostockBut there's still, you know, that entire space of what the higher level components are that you build on top of it. So things like the layouts and the behaviors and the, you know, SVG components, like interpolators for lines and areas, things like that, the other little bits that you can compose together to produce your visualization. I think that space is unbounded. Right. There's an infinite number of things that we could implement there, and so it's just a question of identifying what the most useful components are and then designing a convenient but flexible API to support that. So I think there will be an increasing amount of development in that space going forward. And then also in terms of what I want to do with D3, there's a huge education component as well, just making sure that people find it easy to use and understand best practices, whether that's best practices in terms of how to organize your code. Like you mentioned, you had difficulty with more complex applications, and that's something that we could try to solve. Or visualization practices as well, like what are appropriate ways to design transitions or to label your graphics. There are all sorts of interesting problems in terms of just labeling scatter plots, for example. We could provide better algorithms to solve that problem. So I think of D3 as this kernel that solves the smallest problem, and then we can identify other problems that we need to solve that then build on top of that platform.
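To give a flavor of the scatter plot labeling problem mentioned here, the following Python sketch shows one simple greedy strategy. It is only an assumption about what such a helper could look like, not an algorithm from D3: `place_labels`, `overlaps`, the fixed label size and the candidate order (right, left, above, below) are all hypothetical choices, and the sketch checks labels only against other labels, not against the points themselves.

```python
def overlaps(a, b):
    """True if two (x0, y0, x1, y1) rectangles intersect."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def place_labels(points, label_w=40, label_h=12, gap=3):
    """Greedily choose a label box for each (x, y) point.

    Tries right, left, above, below in that order and keeps the first box
    that does not overlap an already placed label; yields None for a point
    when every candidate collides.
    """
    placed = []
    result = []
    for x, y in points:
        candidates = [
            (x + gap, y - label_h / 2, x + gap + label_w, y + label_h / 2),  # right
            (x - gap - label_w, y - label_h / 2, x - gap, y + label_h / 2),  # left
            (x - label_w / 2, y - gap - label_h, x + label_w / 2, y - gap),  # above
            (x - label_w / 2, y + gap, x + label_w / 2, y + gap + label_h),  # below
        ]
        choice = next(
            (c for c in candidates if not any(overlaps(c, p) for p in placed)),
            None,
        )
        if choice is not None:
            placed.append(choice)
        result.append(choice)
    return result


boxes = place_labels([(100, 100), (105, 102), (300, 50)])
```

In this run the second point sits so close to the first that its right-hand box would collide, so its label falls back to the left. A greedy pass like this depends on the order of the points, which is one reason better algorithms are worth having.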
Moritz StefanerYeah, no, that makes a whole lot of sense. I think it's going to be super important to keep the core as clear as possible, and, as you say, explain it as clearly as possible, and then people will just advance it on themselves.
When Does Mike Bostock Sleep? AI generated chapter summary:
When does Mike Bostock sleep? Well, the answer is not very often right now, because I have a nine month old daughter. Some people say that you're more productive when you have kids. You have to be very motivated to get things done quickly.
Enrico BertiniSo, Moritz, did you ask Mike the question from Robert? From Robert Kosara?
Moritz StefanerOh, yeah. No, I haven't yet. So you can ask it.
Enrico BertiniSo Robert Kosara is asking, when does Mike Bostock sleep?
Mike BostockWell, the answer is not very often right now, because I have a nine month old daughter.
Enrico BertiniOh, well, so we know something about that. Me and Moritz.
Mike BostockYeah, yeah, yeah, yeah.
Moritz StefanerBut that makes it even more crazy.
Mike BostockYeah, yeah, yeah.
Enrico BertiniSome people say that you're more productive when you have kids. I still have to understand whether it's true or not.
Mike BostockYeah. You have to be very motivated to get things done quickly because you don't know what you.
Shan CarterExactly.
Moritz StefanerYou're more time efficient. I think for me it worked out like that, that I was chopping up my work then in really little pieces that I made sure I would manage to do and got much more smart about like what I do when I. Because it could like, it could be over any time. You have to make sure to do the important stuff first and stuff like that. Nice. Okay.
A Very Strange Episode AI generated chapter summary:
Okay. Should we stop here, Moritz? Yeah, I mean, we are at 90 minutes almost. I'm really sorry, guys. It's been a really weird here. Yeah, it was a nice episode, so you can enjoy it.
Enrico BertiniShould we stop here, Moritz?
Moritz StefanerYeah, I mean, we are at 90 minutes almost. I mean, we could go on for hours.
Enrico BertiniI'm looking forward to listening.
Moritz StefanerYeah, it was a nice episode, so you can enjoy it, I guess.
Enrico BertiniI'm really sorry, guys. It's been really weird here.
Moritz StefanerNo worries, no worries. Anything else you want to add? Like, did we forget to ask something really obvious or do you want to sneak in something like little. Any announcement? Self promotion. Come on.
WWE12 AI generated chapter summary:
Shan: I think we covered most everything. We're going to put up a post, hope to include as many of the links we talked about as possible. Maybe we can do one in a year or so. It could be fun. Thanks for having us on.
Enrico BertiniDo they need any self promotion? More than that.
Mike BostockYou want to add anything, Shan? Okay.
Shan CarterI think we covered most everything.
Moritz StefanerThat's good. Cool.
Shan CarterYeah.
Moritz StefanerIt was fantastic to have you. Very interesting. We're going to put up a post, and we hope to include as many of the links we talked about as possible. I think listeners will have to see the graphics in parallel, hopefully. It was great hearing the inside story. Maybe we can do one in a year or so. Talking about before. That's all WebGL based.
Shan CarterYeah, right.
Moritz StefanerIt could be fun.
Mike BostockAll right, sounds great.
Moritz StefanerOk, doke, thanks so much.
Shan CarterYeah, yeah. Thanks for having us on.
Moritz StefanerYeah, thanks for joining and cheers.
Mike BostockBye. Bye. Bye, guys. Bye.