Bocoup and OpenVis Conference
Yannick AssogbaPart of OpenVis is the open source part, but also sort of opening up the community. And if you don't take risks, you sort of stay with the same community that we have.
Enrico BertiniThis episode is sponsored by CartoDB. CartoDB is an open, powerful and intuitive platform for discovering and predicting the key facts underlying the massive location data in our world. With CartoDB you can design and analyze beautiful and insightful maps, check out incredible location intelligence projects and get started for free at CartoDB.
Moritz StefanerHi everyone. It's a new Data Stories. Hi Enrico. How are you doing today?
Enrico BertiniHey, I'm doing great. Beautiful day today.
Moritz StefanerVery nice. That's great to hear. So, anything new in New York town?
New York Town AI generated chapter summary:
On our website we now have a team page. It's not just Enrico and I doing the show, but there's also Destry, who's our producer. And now we have a slack channel where you can chat with us and other guests. So, shall we dive into the episode?
Enrico BertiniNo, no, nothing special. Nothing new from. Yeah, since last episode. Yeah, moving on, moving on.
Moritz StefanerYeah. So two updates from our side we can talk about. We have a Slack channel. So Slack is a chat application and you can register at the bottom of our website, datastori.es. And yeah, you can chat with us and other guests. It's usually quite fun. We discuss the episodes, you can suggest guests, so feel free to join this chat. And second, on our website we now have a team page. It's not just Enrico and I doing the show, as you might think, but there's actually also Destry, who's our producer and helps us a lot with the organization and the contents for the episodes. And Florian, who does an amazing job at making something audible out of the often quite messed up audio files we produce. And yeah, we're very grateful for the help. Thanks Florian. Thanks Destry. And now we have a team page on the website and you can read a bit more about them and get in touch with them. So thanks so much.
Enrico BertiniYeah, I think it's crazy. We never really even mentioned them on the show, so I think it was about time. So thanks Destry and Florian. I mean, this show cannot happen without you, so that's very important.
Moritz StefanerYeah, that's true. So, yeah, shall we dive into the episode?
Enrico BertiniAbsolutely, go on.
Bocoup team at OpenVisConf AI generated chapter summary:
Irene Ros is the director of data visualization at Bocoup and a developer by trade. When I don't do data visualization, I like to think about food. Who goes next?
Moritz StefanerSo today we have not one, not two, we have three special guests and it's the whole team from Bocoup. Hi, everyone.
Irene RosHello.
Jim VallandinghamHello.
Moritz StefanerSo that was almost synchronous. So that's a good sign for team spirit, I would say.
Enrico BertiniYeah.
Moritz StefanerSo can you maybe introduce yourself individually, briefly, what you do, what you're working on, what your specialties are?
Irene RosSure. I can go first. I'm Irene Ros and I am the director of data vis at Bocoup and I've been there for about five years and I'm a developer by trade. I like making engaging and informative visualization work, and I also help run OpenVisConf, which hopefully you've either seen the videos for or have been to. We just ran one a few weeks ago, and when I don't do data visualization, I like to think about food, which is delicious.
Enrico BertiniWho goes next?
Moritz StefanerJim, maybe you are next.
A little about the Bocoup Dataviz Team AI generated chapter summary:
Jim Vallandingham: I'm part of the Bocoup Datavis team. Yannick Assogba: I design and develop data visualizations. Peter Beshai: He built some pretty cool stuff for OpenVis. Overall it's four people, plus probably a few freelancers, plus minus.
Jim VallandinghamAll right, I'll go. My name is Jim Vallandingham. I'm part of the Bocoup Datavis team, and I'm also a developer by trade and moved into Dataviz as a way to increase the potency, the power of the applications we were building. And I've been with the team for about a year and we've been involved in a lot of different projects and it's been a lot of fun.
Yannick AssogbaAnd I'm Yannick Assogba, also on the Dataviz team at Bocoup. And yeah, I design and develop data visualizations and have fun doing it with these other fine folks.
Irene RosI have to give a shout out to our fourth teammate who isn't here, Peter, who joined us recently, Peter Beshai, and he built some pretty cool stuff for OpenVis, which maybe we'll talk about later, but he's here in spirit.
Jim VallandinghamCool.
Moritz StefanerBut overall it's four people, plus probably a few freelancers, plus minus.
Irene RosYep. A lot of other folks on the Bocoup team actually jump in and help us on projects whenever we have something fun that needs doing or when we have too many fun things to do.
Bocoup: An Open Source Company AI generated chapter summary:
Bocoup has been around for a while, I want to say six or seven years. It's basically an open web technology and design company. We're primarily a consultancy. Sometimes we actually end up producing new open source technologies.
Moritz StefanerSo can you tell us a bit about what is Bocoup? How long has it existed? What do you do? What are your main fields of activity?
Irene RosSure. Bocoup has been around for a while, I want to say six or seven years. I've been there for about five. And we're basically an open web technology and design company, and we're primarily a consultancy, so people come to us from all kinds of industries and from academia as well, an awful lot actually, to help them build whatever they need built with technology, anything from websites to robots. And generally we use open source technologies to do that. And then sometimes we actually end up producing new open source technologies, often actually as a result of that work, or we actually get directly hired to create open source technology. It's pretty core to our mission to basically help move the technology space into using open practices from engineering to the way we talk to each other. So far it's worked out pretty well.
Moritz StefanerSo can you give us a few examples just to understand a bit concretely, when you say frameworks, what types of things have you been working on, or commissioned projects? I know you also do a lot of self commissioned work. Can you give us a brief overview of the. Yeah, a little project gallery and maybe we can dive into a few in detail.
Irene RosYeah, sure, we'd be happy to. So we have quite a few projects on our Dataviz page and we thought of a couple that might be interesting to touch on. And obviously Bocoup has work in other types of services, so from web application development to training, things like that. But we've done some really interesting projects anywhere from just open source tools like working with Jeff Heer's team on Voyager and Lyra, to a little bit in journalism, things like Global Post or the Guardian several years ago. And we've been enjoying working with text. It's one of the kind of themes that we've realized was happening since we did Stereotropes last year, and we've also been doing a bunch of teaching. Jim and Yannick just did a really great text analysis and visualization workshop for OpenVisConf. So just a few to mention.
Enrico BertiniSo why don't we dive into some of these projects? That would be nice. Maybe we can start with Voyager and Lyra.
The Mozilla Knight Grant for Voyager and Lyra AI generated chapter summary:
Voyager and Lyra are tools that help to automate the communication of data through data visualizations. The Bocoup team used a Mozilla Knight grant to improve the capabilities of Voyager. There are tons of tools out there that could be useful to a larger community, and they just need a little love.
Jim VallandinghamYeah, Voyager and Lyra are both interesting projects that we've been able to work on with Jeff Heer and his team, and it's been really exciting to be part of this growing ecosystem of tools around helping to automate the communication of data through data visualizations. So I think you guys have already heard about Voyager a little bit, right?
Enrico BertiniYeah. Maybe you can briefly describe what Voyager and Lyra are.
Jim VallandinghamSure. Yeah. Voyager is a tool where you give it a data file and it starts automatically generating visualizations of the different facets of the data, ones that are produced through a recommendation system. So it can take into account some of the actions that the user takes. So you select a particular variable that you're interested in and more visualizations pop up showing you different dimensions rotated around that dimension. And Lyra is kind of an advanced tool for developing quite beautiful and powerful visualizations without coding specifically. And they've recently introduced the ability to harness interactions in the underlying spec, Vega. So those interactive capabilities will be part of these tools and part of the way you work with them in the very near future, which is really pretty exciting stuff.
Moritz StefanerAnd what is your role? So I know these projects that come out of a research lab from Jeff Heer, which we had on the show.
Jim VallandinghamYeah. To be clear, all credit goes to them. We were fortunate to get a Mozilla Knight grant for Voyager and use that opportunity to work with them and bring up some of the capabilities in terms of performance and some of the capabilities in terms of UX and UI design. We have a very talented designer, Jess, who helped with prototyping some new looks and centering the flow of the tool, which we were able to implement. And then with another coworker, K. Adam White, we were able to really speed up the UI and the data processing part. So trying to make it come out a little bit more robust, a little bit more user friendly, and a little more applicable to the general public. A while ago, I don't know if you've seen recently a post from Lisa, is that right? She tried out twelve different tools, including Voyager and Lyra, making the same graph over and over again. Really interesting thing, but it highlights some of the capabilities that these tools have, which is kind of fun.
Enrico BertiniYeah, I think that's a very interesting model. And as you can imagine, being a professor myself, I'm very much interested in knowing more how this happens, and I'm really glad to see that this is actually happening. That there are tools that are originally developed in the lab, mostly to be prototypes, and there are people like you that are basically helping researchers transition these tools to something that a much broader set of people can access and use. That's amazing.
Jim VallandinghamYeah, it was a lot of fun and certainly something we want to do more of. Yeah, because you're right, there's tons of tools out there that could be useful to a larger community, and they just need a little love, a little care.
Vega: What is the Open Source Community? AI generated chapter summary:
Jim: Do you also provide services in terms of making this tool available on the web? What are the main components that are missing in research prototypes and homemade DIY solutions that you can bring to the table? Many people aren't ready to take on or realize what it means to open source a project.
Enrico BertiniSo, Jim, you've been mentioning the work that you've done on the tool itself. I'm wondering, do you also provide services in terms of making this tool available on the web and making sure that it's easy to, say, download, install, find documentation? Are you also taking care of this part?
Jim VallandinghamYeah, in certain circumstances. I think Jeff Heer's team for this particular project has done a really good job in trying to organize documentation and stuff around Vega. They have their own tool, their own GitHub repo and stuff like that. For another project with Santiago Ortiz, we were working on his Moebio framework, and a large portion of that was developing plans around how to form a community and how to engage and allow for other external collaborators to come in and improve the tool and work with the tool. So for that project, it's a code based framework that allows you to kind of take a data set, turn it into its basic forms, like a number list or a string list, and then use those forms to visualize it. And Yannick and I worked a lot on improving documentation, improving the stories around how it might be used and, you know, getting it set up and prepared for open source usage. You know, picking a license is a subtle but very critical piece of the puzzle for open source tools. And so that turned out to be a rather lengthy conversation, but very fruitful eventually.
Moritz StefanerYeah, yeah, I think that's a really fascinating topic. Like, having a good idea is one thing, or having something that works for one paper or to try something out is one thing, but actually having a product and building a community around something, building a sustainable thing is such a different.
Jim VallandinghamYeah, it's a challenge for everybody involved.
Enrico BertiniYeah, no, but so important. At the same time, I think you are covering a super important role and I would love to see more of that happening. Maybe we should talk more offline.
Jim VallandinghamYeah, yeah, definitely.
Enrico BertiniYeah.
Moritz StefanerAnd what do you think? Like, what is the main value you can bring, coming more from a general web development and communications and maybe open source background? What are the main components that are missing in research prototypes and homemade DIY solutions that you can bring to the table? I think you mentioned a few, but.
Jim VallandinghamYeah, I think I liked, I didn't get a chance to see, but I remember you had a workshop about everything but the visualization. Yes, I think more of that providing insight into what is needed is the first step. It's just a lot of people aren't ready to take on or realize what it means to open source a project. And the requirement in terms of time and commitment, in terms of documentation, and if you want it to be successful, and so revealing that demonstrating that in successful projects in the past and guiding that into the future. I don't know. Irene, if you had other comments.
Irene RosYeah, I can jump in. I think certainly one of the easiest misconceptions to make about open source is that you just open your code and then that's the end of the story. And that's actually just the beginning of the story because it takes so much more effort to actually get people interested and create enough materials for them to observe the way your project grows and evolves. I mean, if you think about D3, I'm always so amazed at how much work Mike Bostock does. He's there every day. You can always ask him a question. He's answered thousands of questions online. And that's a huge part of why D3, I think, is so popular. Even, you know, obviously it's an amazing tool, but it also has this continuing thread over time. And so part of our work, aside from making the code a lot better. So everyone is a really good architect, and that's a big piece of it, just making maintainable code that's well documented and tested. A big piece is also figuring out how to weave all of those different parts together, how to bring a community, what kind of assets we should create to try and get people on board. It's a really big part of that.
Moritz StefanerYeah, it's true. And, I mean, of course, we feel the same with data stories, and we were just talking about it so great that we have some help now, because in the first few years, we were thinking like, yeah, we're just doing a podcast, and, you know, we just record something and put it online. But there's so much work around it in terms of communication and documentation, just being present and making it a thing, you know, that. Yeah, it can be more than what you would think of as the actual work, let's say. And the other thing is, I think often things, some things just get picked up and develop a dynamic and others don't. Or sometimes it takes years, and something comes out of the, you know, out of the woods, and you were like, it was there all the time. You could have used it all the time, but, you know, and these dynamics are also hard to grasp sometimes.
Irene RosYeah. I mean, even D3, before D3 was Protovis, before Protovis was Flare. Right. There's a whole history to that project eventually evolving.
Moritz StefanerIt didn't fall from the tree either.
Irene RosYeah.
Jim VallandinghamNothing.
Moritz StefanerYep.
Jim VallandinghamYeah.
Moritz StefanerNo, but it's great. And I think it's a great model to, as an agency, to take these things and say you work on the professionalization, let's say, of these research tools. The third big part is obviously also, you do a lot of self commissioned projects and experiments and, you know, try to push things yourself. I remember the Stereotropes project, which was a nice take on this direction. So maybe, Yannick, maybe you can tell us a bit about this one.
Stereotropes: The Visualization of tropes AI generated chapter summary:
Stereotropes is a web based project and visualization of tropes, or descriptions of tropes from tvtropes.org. It provides an interface to sort of look through about 100 of the most used tropes in film and television. Did you develop and identify all these tropes yourself?
Yannick AssogbaSure. I can maybe give a quick description of it first and then sort of tell you sort of how we ended up doing it. So Stereotropes is a web based project and visualization of tropes, or these descriptions of tropes from tvtropes.org. And those are really descriptions of characters and themes that appear in film and in media often and repeatedly, that we sort of recognize and that sort of have these amusing names like Papa Wolf or Mama Bear or the damsel in distress or the scary black man or something like that. And these are things you see in lots of different media. And there's this really interesting community out on tvtropes.org that sort of documents and records and describes and sort of has this discussion around these things and captures them. So we became interested in sort of using visualization, and when we're doing sort of self commissioned work, we can sort of go to different places using visualizations to look at this aspect of culture and sort of examine it and put it under a bit of a light. So it basically provides an interface to sort of look through about 100 of the most used tropes in film and television and particularly through the lens of gender. What is it? What kinds of things are associated with female tropes and male tropes? So we actually start with these two lists that they have of always male tropes, always female character tropes, and wanted to examine that and see how are these characters described? What is it saying about the portrayal of men in media and of women in media? And how does that differ across different kinds of things?
Moritz StefanerAnd did you develop and identify all these tropes yourself? Or are there already trope collections with an API that you can connect to, or how does this work?
Yannick AssogbaSo that's one nice part. These descriptions are developed by this community. So it is people looking at media, critiquing it in community and trying to figure out what these things are. From an access point of view, the API is Irene scraping the data and cleaning it up. And actually there was some help from a project called DBTropes that we could shout out, but a mix of sort of pulling that together. It's actually a wiki. So on one hand, it's a very interesting, engaging community to go to. You can spend hours on tvtropes.org and I have, and it's a wiki of people documenting these things across a lot of different media and where they appear and how they relate to each other. And some are very common. And you read these things and you're like, oh yeah, I've seen that in like ten movies or something, or the hot scientist or something.
Enrico BertiniCan you maybe give us some examples? Are there anything you.
Yannick AssogbaYeah, so I'm sure like ones that may be quite familiar. Are things like a femme fatale or a hot scientist, for example, or dumb muscle, a sort of strong but otherwise unintelligent, typically male character, or a screaming woman who just sort of appears and just screams and has no words or lines in a film.
Moritz StefanerI have a whole movie in my.
Yannick AssogbaHead already it's happening, but some are a bit more subtle. So, like, one that I like and find funny is the beard of evil. And just this common trope of, like, the more evil characters having beards and sort of, what does that say? So there's quite a few, and they're pretty funny. And what we did is take the descriptions of them and pull out adjectives. So using some NLP stuff, pull out some adjectives, and then see what kinds of adjectives are most associated with a particular trope, which is one thing you can examine on the site, and also with male tropes overall versus female tropes. So what kinds of things tend to be used to describe roles that women are often put into or men?
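Editor's note: the approach described here (pull adjectives out of trope descriptions, then see which ones lean male or female) can be sketched roughly as follows. This is a minimal illustration and not the Stereotropes code: the descriptions are made up, the small `ADJECTIVES` set is a hypothetical stand-in for a real part-of-speech tagger, and the smoothed log ratio is only one of several possible association measures.

```python
import math
import re
from collections import Counter

# Hypothetical stand-in for a part-of-speech tagger: a tiny hand-made adjective list.
ADJECTIVES = {"evil", "strong", "dumb", "beautiful", "dangerous",
              "seductive", "loud", "silent"}

# Made-up descriptions for illustration only; not real TV Tropes text.
DESCRIPTIONS = {
    "male": [
        "A strong but dumb henchman who follows an evil boss.",
        "An evil villain with a dangerous plan and a strong army.",
    ],
    "female": [
        "A beautiful and seductive stranger who is also dangerous.",
        "A beautiful bystander who stays silent until the loud finale.",
    ],
}


def adjective_counts(texts):
    """Count how often each known adjective appears across the given texts."""
    counts = Counter()
    for text in texts:
        for word in re.findall(r"[a-z]+", text.lower()):
            if word in ADJECTIVES:
                counts[word] += 1
    return counts


def log_ratio(word, counts_a, counts_b):
    """Log ratio of add-one smoothed relative frequencies.

    Positive values mean the word leans toward group a, negative toward group b.
    """
    vocab = len(ADJECTIVES)
    p_a = (counts_a[word] + 1) / (sum(counts_a.values()) + vocab)
    p_b = (counts_b[word] + 1) / (sum(counts_b.values()) + vocab)
    return math.log(p_a / p_b)


male = adjective_counts(DESCRIPTIONS["male"])
female = adjective_counts(DESCRIPTIONS["female"])
scores = {word: log_ratio(word, male, female) for word in ADJECTIVES}

# List adjectives from most female-leaning to most male-leaning.
for word in sorted(scores, key=lambda w: (scores[w], w)):
    print(f"{word:10s} {scores[word]:+.2f}")
```

The add-one smoothing keeps an adjective that appears in only one group from producing a division by zero; a real analysis would need far more text and a proper tagger before the scores meant anything.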
Moritz StefanerIt's interesting. It's almost like a digital humanities project in a sense, that it's a topic you would typically write an essay about, like, maybe five years ago and now, you know, 2014 or when you did it, you make a database, and suddenly you look at all these tropes as a whole and analyze which are the outliers, which are the typical ones. So I think that's a very timely approach here. And a good topic. Yeah, I'm actually working on a narrative pattern collection right now, so it's basically, it could be very similar. So some of the narration patterns, if you take them to the extreme, might become a trope. And then these two projects might need.
Jim VallandinghamI like just exploring. Like, beard of evil shows up in Disney movies. Lion King, Scar, he has a little beard of evil, and Pocahontas, the bad guy in there, also beard of evil. Since I have a beard.
Moritz StefanerYou should sue all of them. All of this. Yeah.
Jim VallandinghamInappropriate use of beard. But, yeah, it's fun to see the spots where they show up, and then you come back and you think about, that does make sense. That is a trope. That is something that I've seen over and over again. I didn't realize it's been ingrained into my consciousness of how I should react to this character.
Irene RosOne of the best things that came out of Stereotropes is that we actually had a lot of conversations with people who have nothing to do with technology. They were just really curious about the things we were finding in the data. And because it was a tool, they could go back and read things and look at the movies where things appeared and everything was so cross linked. We were certainly accused of making a rabbit hole, but it was really nice to just have conversations about culture and gender and media really separate. I almost feel like that was the greatest testament to Stereotropes being a good project, a lot of people didn't talk about it necessarily outside of our community. It just helps them dig in and think about these topics.
Projects in the Making AI generated chapter summary:
We generally come up with ideas for things like this ourselves. We do a lot of prototyping around both the analysis and some of the information design. We narrow it down until we feel like we hit the right mark or we give up. It's pretty iterative.
Enrico BertiniSo how do projects like this one start, and how do you decide whether it's a good enough project to commit on it?
Irene RosWell, so we generally come up with ideas for things like this ourselves, and we're actually starting to think about our next one, so we're pretty excited about that. And we have some brainstorm sessions. We all think about data that we want to explore a lot of the times. We'll do some data analysis in advance and just see is there anything interesting there. We also do a lot of prototyping, so that's a big part, actually, of the projects we work on, as we do a lot of prototyping around both the analysis and some of the information design. So we kind of narrow it down until we feel like we hit the right mark or we give up on that dataset and go to something else. It's pretty iterative.
Bocoup's Commercial and Open Source Work Balance AI generated chapter summary:
How do you balance doing commercial commissions with the more long term? And then the open source activities and the self commissioned stuff? How do you organize this balance?
Moritz StefanerAnd how do you, I'm always interested practically, how do you balance all this? Okay, so you live in Boston and you have a studio there, so probably that costs a bit of money. And so how do you balance doing commercial commissions with the more long term? I guess a Knight Foundation grant is probably a bit more long term, a bit more calculable. Then you organize a conference that we will talk about in a minute as well. And then the open source activities and the self commissioned stuff, like how do you, do you have like quotas, like, or do you say on Friday we just work on fun stuff, or do you sort of have different periods of activities? How do you organize this balance? I'm super curious about this.
Irene RosSure. Yeah. So we generally, when we work on commercial work, we only work on one project at a time at Bocoup. That's something that we do for the sanity of all of our staff.
Moritz StefanerThat's a very smart idea.
Irene RosYeah, it's really hard to switch context. And so, you know, we have commercial work that is the majority of our time, and we have a construct that we call perch time, which I know in some environments is called bench time. We don't look at it that way. It's really our time for learning and our time for creating things that we're excited about exploring. And so we have an amazing financial director, shout out to Jasmine, who helps us keep track of all the numbers. And we've been really, really fortunate that we've had a ton of really exciting work, that some of the open source work ends up being commercial work at the same time and things like that. So it's a balance, but we're really excited about our field, so we really want to make time for all kinds of projects. And it's really the blend, honestly, that makes it work really well.
Moritz StefanerAnd if you have just one active project at a time, how do you deal with this huge padding at the beginning, at the end of projects where you're basically done, but there's still a couple of revisions or you can almost start. We just need the data, and then it doesn't come for weeks. Like, how do you deal with that? It might be a very specific question, but I want to improve my own practice.
Irene RosIt certainly happened. So we have a model called continued access that we actually just started where it's access to one of the folks on the team at a smaller amount of time and obviously a more favorable rate. And that's anywhere from assisting with engineering tasks and fixing bugs and things like that to actually just helping think through, you know, higher level problems or architecture, things like that. And so that's kind of one way that often projects will, you know, once they're done, we'll sort of transition to that mode. But then earlier on in the beginning of projects, we generally do kind of a research phase where, you know, we make sure we do a lot of conversations before we kick off to make sure everything that we need to get started is there. And we'll wait if it's not. And then once we do get going, we can generally hit the ground running. That's always what we strive for, but we'll spend a lot of time doing research and design and a lot of prototyping before we kind of jump into the full on development. In the end, it always saves us time. So that works out pretty well.
Jim VallandinghamYeah, it's about communication. Communicating with a client, communicating back and forth, getting everybody on the same page is the hard part, I guess. And data access is obviously part of that.
Yannick AssogbaI've often found that the start of projects, there's a bunch of getting to know the client and their context that you can do while waiting for data as you poke at them to be like, oh, yeah, send that thing that you said you had when we first started talking. But then there's getting to know more about their field. I think one thing that's interesting about technology in the computing space is getting to dive into different other fields a little bit. So reading up on that stuff on which we are not experts is also a fun use of some of that initial time.
Exploring CartoDB's Data Observatory AI generated chapter summary:
The data observatory allows you to augment your data by providing additional measurements of populations, jobs, commerce, and many other interesting location based dimensions. If you want to see how this works, go to the link CartoDB.com/data and find more information.
Enrico BertiniThis is the right time to take a little break and talk about our sponsor, CartoDB. CartoDB is a web based application that allows you to load location data, display it using a lot of different geographical mapping methods, and then discover new information and generate new insights by using many of the functionalities they provide. And today I want to talk about a specific one they have recently introduced. This is called the data observatory. What is the data observatory? Well, it's an additional layer that allows you to augment your data by providing additional measurements of populations, jobs, commerce, and many other interesting location based dimensions that you can find directly within CartoDB. And as you can imagine, this is very powerful because you're no longer restricted to the information that is already contained in your data, but you can expand it with many different measurements. So there is one example that I found really interesting that you can find on CartoDB's website that shows how this works exactly. You can, for instance, select one specific location on a map, and CartoDB would provide for you how many people can reach this point from a walking distance, and for each block around this point, what is the per capita income, median age, and other measurements. So it's very, very powerful. If you want to see how this works and get more information about how the data observatory works, you can go to the link CartoDB.com/data and find more information. And now back to the show. Product development is not the only thing that you do. So you also teach courses, right? And of course, you're also organizing OpenVis that we want to talk about. So maybe can you tell us a little bit about your courses, what you teach and how, and maybe we can move on to OpenVis?
OpenViz: Teaching and Development AI generated chapter summary:
And now that we have transcripts for the podcast, I would love to see more happening on the product side of things. For Openvizconf, Jim and Yannick developed a workshop around text analysis and visualization, which was amazing. If you're looking for a project to do, find a paper or ping us.
Irene RosYeah, sure. So we have kind of a variety of different workshops at Bocoup that we teach. We've recently switched to a shorter format where they're actually kind of an afternoon, three-hour workshop that is part of a series. Sometimes it doesn't have to be, but we've taught some workshops around user-centered design already that were really great. We actually partnered with a nonprofit in New York and did some design work for some of their projects while teaching design. And then our team, more specifically, has a lot of material around D3. So we've done anything from half-day to kind of three- or four-day classes around D3. And then more recently for Openvizconf, Jim and Yannick developed a workshop around text analysis and visualization, which was amazing. Maybe I'll let them talk about that.
Jim VallandinghamYeah, it was a lot of fun. Yeah, I mean Yannick had had a lot of text vis and text analysis experience in the Many Eyes projects and in Stereotropes. And so we kind of wrapped that together into a starter course in learning how to do some analysis in NLTK, a Python package, and then transitioned into kind of a fun couple-hour exploration of data vis. Mostly, a lot of the text vis stuff comes from the academic field. So it was a chance to introduce people from industry to some of that work. And we got people excited about implementing perhaps open source or open varieties of some of the tools that you see from the academic world, again to try to bridge that gap a little bit better. But I think it turned out really well. It was a full day, was a lot of work, a lot of practice, a lot of cutting of content, as you probably can guess. But everybody had a good time. The hardest part is getting everybody on the Wi-Fi, and downloading data always ruins the workshop.
Moritz StefanerAnd the local web server took up the other half. Right?
Jim VallandinghamYeah, exactly, exactly.
Enrico BertiniYeah. No, but I have to say text vis is such a fascinating area. And you're right, there has been quite a bit of research in this area for a few years. We had Chris Collins on the show a few months back and he does an amazing type of research. Yeah, we love his work, but I would love to see more practical tools coming up and I think that's a very interesting space and yeah, I would love to see more.
Yannick AssogbaI'll make another shout out for listeners. As you mentioned at the end, we basically, one of the exercises we gave was like hey, let's take some papers for some visualizations we know we like, and let's make open source, JavaScript, web enabled versions of this.
Enrico BertiniOh yeah, that's a great idea.
Yannick AssogbaIf you're looking for a project to do, find a paper, or ping us if you want my favorite text visualizations from researchers, and just make one. That could be cool.
Jim VallandinghamAnd we had small groups, we had 30 people in the workshop and everybody broke up into small groups and there were some attendee contributed concepts as well that were very successful. But I was very impressed at how much progress people made in the time that we allotted for that, that kind of hands on implementation. We had about 2 hours for that and it was a lot of fun. And so hopefully we can send some links of the projects that were kind of works in progress.
Enrico BertiniOh yeah, that would be nice. I would love to see these projects. No, but again, I think Textviz is a very, very interesting area and I would love to see more happening on the product side of things. Absolutely.
Moritz StefanerAnd now that we have transcripts for the podcast, I mean, you know, there's ample opportunity.
Jim VallandinghamThere you go. Yeah, well, that's what. That's what we did at Openviz.
Enrico BertiniThat's one of the reasons why we started creating transcripts. A few months or even years down the line, we're gonna have a very interesting repository, and hopefully people, yeah, will just play with it.
Moritz StefanerYou did something amazing for the Openvizconf video archives. I just saw that.
Jim VallandinghamYes.
Moritz StefanerAnd we should talk about the conference anyway, so maybe. But let's talk about that video archive, because I think it's such a smart idea. Can you. Who built that and how did you do it?
The Video Archive at OpenViz AI generated chapter summary:
Peter: Who built that video archive? It was actually a build by Peter. The transcripts were also being streamed live online, so that you could just go and watch them on the website. I think it's an amazing use of data to make talk video more accessible.
Irene RosYeah, so this was actually a build by Peter. It was his first Bocoup project, and we had the transcripts from Openviz, which were really amazing. It was so great to have that resource. Last year, we tried to more automatically extract some of the text using a few tools we found, and they did not do so great a job. Not surprisingly, things like WebGL are not easily detectable, and so there was a lot of manual correction to that version. And then in this situation, we were really, really fortunate that our transcriber, Amanda, did such a great job.
Moritz StefanerSo during the conference, she live transcribed as well, right?
Jim VallandinghamYeah, it was mesmerizing, too. Everybody was, you know, it was great and bad. Everybody wanted to watch the transcripts float up on the screen. It was impressive stuff.
Irene RosIt was so popular. We really didn't expect that. We kind of wanted to make it accessible to people who couldn't be there, because the transcripts were also being streamed live online, so that you could just go and watch them on the website. And some people did. It was really interesting. They kind of pieced together things from Twitter and from the transcripts, people were posting photos, so it was fun. But, yeah, Peter took the transcripts we had and, using TF-IDF scores for individual terms and bigrams, picked out some of the top n for all of the different talks. And then we all came up with this concept of this sort of film strip of thumbnails. There were, I think, 30 thumbnails being taken for each of the talks. And we actually used for inspiration one of the New York Times pieces around fashion shows that sort of had this slidey accordion that we really liked. So it was a little bit inspired by that. And then some of Yannick's work on Stereotropes around kind of the gender panel that compared male and female individual adjectives inspired a little bit of the term layout for the words, and then kind of the rest of the touches came together from Peter. So, yeah, it was really fun to build, and the terms were very, very telling of the talks. Having seen them, it was actually really exciting to see. Wow, these are so on point. So we felt like it would be a useful way to get into the talks.
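Editor's note: the term selection described above can be illustrated with a minimal sketch. This is not Peter's code; it assumes plain-text transcripts keyed by talk, a simple regex tokenizer, and the textbook TF-IDF weighting (term frequency times the log of inverse document frequency). The names `tokenize`, `terms`, and `top_terms` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def terms(text):
    """Return the individual tokens followed by all adjacent-word bigrams."""
    tokens = tokenize(text)
    bigrams = [" ".join(pair) for pair in zip(tokens, tokens[1:])]
    return tokens + bigrams


def top_terms(transcripts, n=5):
    """Rank each talk's terms by TF-IDF and keep the top n.

    transcripts: dict mapping a talk name to its transcript text.
    Returns a dict mapping each talk name to a list of its n best terms.
    """
    counts = {talk: Counter(terms(text)) for talk, text in transcripts.items()}

    # Document frequency: in how many talks does each term occur?
    doc_freq = Counter()
    for talk_counts in counts.values():
        doc_freq.update(talk_counts.keys())

    num_docs = len(transcripts)
    result = {}
    for talk, talk_counts in counts.items():
        total = sum(talk_counts.values())
        scores = {
            term: (count / total) * math.log(num_docs / doc_freq[term])
            for term, count in talk_counts.items()
        }
        # Highest score first; ties are broken alphabetically for stable output.
        ranked = sorted(scores, key=lambda term: (-scores[term], term))
        result[talk] = ranked[:n]
    return result
```

Terms that occur in every talk get a score of zero and fall to the bottom of the ranking, which is why the surviving terms tend to be specific to one talk.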
Moritz StefanerYeah. And you quickly get a sense of what is the talk about, but also, does it have different chapters that are wildly different, or what's the narrative structure? And for the key concepts you might be interested in, you see where they appear in the talk, so you can just jump to that point where they talk about networks or something. And so I think it's an amazing use of data to make talk video more accessible. Great job.
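Editor's note: jumping to the point where a concept appears could work along these lines. This is a sketch under stated assumptions, not the archive's actual implementation: it assumes the transcript is available as (start time in seconds, text) segments, and `build_term_index` is a hypothetical helper.

```python
import re
from collections import defaultdict


def build_term_index(segments, key_terms):
    """Map each key term to the start times of the segments that mention it.

    segments: iterable of (start_seconds, text) pairs (an assumed format).
    key_terms: lowercase terms or bigrams, e.g. the top TF-IDF terms of a talk.
    """
    index = defaultdict(list)
    for start_seconds, text in segments:
        # Normalize to space-separated lowercase tokens so that matching
        # happens on whole words, not on substrings inside longer words.
        normalized = " ".join(re.findall(r"[a-z0-9]+", text.lower()))
        padded = f" {normalized} "
        for term in key_terms:
            if f" {term} " in padded:
                index[term].append(start_seconds)
    return dict(index)
```

A video player could then seek to any of the stored start times when a viewer clicks a term.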
Irene RosThanks.
Moritz StefanerThe Openviz conference in general. I mean, it's a big thing that could probably keep you busy full time already. So can you tell us a bit about the conference? It's been around for a few years. Three or four maybe.
The Conference on Data Visualization AI generated chapter summary:
This was the fourth annual data visualization conference. It brings together practitioners in the space to learn how to do the work. The selection process is very unique because usually there's two competing models. We had about a little over 200 talk submissions this year.
Irene RosYeah, this was our fourth. Yeah, it's definitely our baby. We're very proud of it. We started it as a way to bring together actual practitioners in the data visualization space to learn how to actually do the work. I think, especially at the time that I was new to data visualization a while back, it was hard to figure out, how do I do some of these things? And I shadowed people around and watched them. But we really wanted to make a place where that was kind of the norm and the culture and kind of shift our community towards being a more open, transparent community about how we do the work. We love talking about our process and the tools that we use, and we wanted more people to do that. And so we ran the first one at the Museum of Science in Boston, which was an amazing place, and it's really grown every year. We've managed to kind of increase the number of people that we can let in. This year sold out really quickly, which was amazing for us, but also we want to make sure everybody can come. So that's a great, great problem to have. And, you know, there's lots of work that goes towards putting the program together. So we have a committee of seven people that the three of us are on, and then we also have non-Bocoup folks. So Lynn Cherny is my co-chair. And then we have Gabriel Florit and Nicholas Diakopoulos and Alex Graul, who have helped us tremendously over the years. And we spend a lot of time, months definitely, putting the program together. So we have an open call that goes out and we try to think of kind of topics that we think might be interesting and list those. And then we reach out to a lot of people and try to talk to them about their work and see if they're interested in submitting. And then everything goes into our big submission queue. We had a little over 200 talk submissions this year, which was pretty incredible. Yeah, we never budget enough time because everyone waits for the last minute to submit their talks. Please don't do that. I'm just kidding. And so we spend weeks really reviewing things, and we'll often actually build tools around our reviewing process because every year we evolve it a little bit. And there's really not great off-the-shelf tools for doing some of the kind of reviewing that we do.
Moritz StefanerIt's a great chance for data visualization, obviously.
Irene RosIt's so tempting. It's really tempting.
Moritz StefanerMulti dimensional, like, you know, trade offs being balanced in real time.
Jim VallandinghamExactly, exactly. Well, the conference itself is kind of the nexus point of so many disciplines. Data vis is the overlap of so many disciplines. So highlighting each of those areas in a conference organization is difficult and something that I think Irene and Lynn do a great job every year at achieving.
Moritz StefanerThis selection process is very unique because usually there's two competing models. Let's say the one is, let's say the design or business conference model, where some committee reaches out to people and invites them, and then they talk about whatever they want, or they have a rough briefing, but in principle, yeah, they're just invited as people. Or in academia, you would submit a paper and then the paper gets accepted or not, or maybe sometimes an abstract. And so yours seems to be sort of halfway in between, in the sense that people need to apply with a topic and like an idea of a talk, but then it's just this idea. And then if you get accepted, you do the full talk. Right? Is that on purpose or how did you come up with that model?
Irene RosYeah, I mean, it is a, we wanted to make the barrier to applying as low as possible. A paper is certainly a pretty high barrier to get in. And we also did not want to just do an invitation only conference because we're aware of our own biases and our own networks. In a sense, if we only invited people for four years, we would run out of people to invite. And it's still really hard because we bring so many multidisciplinary people to speak from completely different fields. It's still really hard to find them. But we're always surprised to receive applications from people we've never heard of. And it's actually really important. We have lots of principles we try to uphold, and one of them is to bring in people who are new to the field or who are completely outside of our kind of direct community to share what they know. And so we always look for people who maybe none of us have ever heard of but are going to come there and just knock everybody's socks off. And that happens all the time. And it's so great when it does.
Moritz StefanerYeah, that's the best. Of course. Yeah.
Enrico BertiniThat's such an important thing. And I've been organizing myself a few very small events, and it's always hard to come up with the right principles. And you're totally right. We all have our own biases, and we see them in data stories, by the way. Right. We are always, always discussing this thing. Right. Who should we invite next and how are we going too much into this direction or that direction? And. Yeah, and by the way, that's one reason why we like receiving suggestions from listeners who they want to see next. That's, that's very important. And so hard. It's so hard to find the right balance. Right. Because on the one hand, you do want to be kind of like the editor of something, right? I mean, you want to give a style to the event that you are, that you are organizing, but at the same time, you don't want the event to be closed or to fully reflect your own biases. Right.
Bocoup Conference 2018: The Diversity Program AI generated chapter summary:
This year's Openviz had the most diverse program ever. It's so hard to find the right balance. The editorializing part is really interesting. People appreciate being there and speaking to the audience.
Irene RosAbsolutely.
Enrico BertiniIt's so hard.
Moritz StefanerAnd of course, you run a higher risk if you, you know, let somebody speak whom nobody has ever let speak before. You know, it's like, yeah, you're putting yourself out there in a sense that, yeah, you're risking a bit, but I think then the beautiful thing is when it pays off. So I think this year you had the most diverse program ever. Like, you know, it was just super mixed up and super colorful, like, overall, like, both from topics and people. And I also feel it was, content wise, maybe the best edition so far. Right? I mean, it was my impression, at least. So it seems to pay off, right?
Irene RosYeah. There's a lot of work to it. The editorializing part is really interesting because it is such a tension. And the way that the program comes together, we always have a top N of talks that is at least two to three times bigger than the actual space that we have. And then we actually kind of drop the, you know, obviously we remember who the people were, but we go to a more conceptual planning level where we start thinking about, okay, this is a talk about systems. And then we're going to bring in design for, you know, real time craft, you know, spacecraft operation. And how do those things weave together? And the ordering matters a lot. So we do try to create kind of a two-day flow through all these different topics. And sometimes certain talks won't make it because they don't fit as well into the flow and others will. And so that's a pretty big piece of it. And then as far as some newer folks and how to support them, we talk to our speakers a lot beforehand. Once they're accepted, they have complete access to me and everybody else at Bocoup and on the committee if they want it. We've done anything from coaching sessions to run-throughs to let's just brainstorm about your topic. Our speakers have been so engaged. It's really been amazing. And I'm so grateful that every year they come and they're just right there with us the whole time. So I think that's a really big part of it. People appreciate being there and speaking to the audience.
Enrico BertiniYeah, yeah. And I have to say another aspect that is very important for me is that you are mixing academics and practitioners. And I see this as one of the most interesting aspects or features of Openviz. I have to confess I've been kind of jealous the last two editions because I really wanted to participate and I think I missed the deadline for last year by a few seconds or so. But. No, I mean, that's great. Thanks for organizing it and for organizing it this way, because I think I'm a big, big proponent of mixing people from different backgrounds, as I said, especially from academia and practitioners, because there are not many opportunities to let these people talk to each other and it's so, so important.
OpenViz: Mixing Academics and Practitioners AI generated chapter summary:
Openviz mixes academics and practitioners. I see this as one of the most interesting aspects or feature of Openviz. They should literally make it harder for academics and easier for practitioners. Having that bridge is at least a start in opening up these two fairly isolated parts of the Datavis community.
Irene RosAbsolutely. I mean, we all have dipped our toes in academic publishing at one point or another, and so we're familiar with these communities. You know, I would love to go to InfoVis if I could find a way to afford it, but I just read the proceedings afterwards.
Enrico BertiniThey should literally make it harder for academics and easier for practitioners, right? It's kind of like practitioner discount. They should do something like that.
Jim VallandinghamGet on the committee and get that figured out.
Enrico BertiniI mean, it's counterintuitive, right? You expect people from business to be wealthier than people from academia. But what happens in practice is that academics, professors like me, they already have budgeted money to go to these events. Right. But you don't. So that's. Yeah, that's a tricky issue.
Jim VallandinghamYou mentioned we had Chris Collins and a few other academic people this time around. And so, yeah, having that bridge is at least a start in opening up these two still fairly isolated parts of the Datavis community.
Enrico BertiniIt's getting better.
OpenViz 2017: A Year of Risk AI generated chapter summary:
The talks are all online already, so, dear listeners, you can check them out. Do you have any favorites, like, from this year? They all provide different. How can that be more diverse and more grown?
Yannick AssogbaI wanted to make a quick comment about what Moritz said about risk. And I think sort of taking some risks is important in. I think part of Openviz is the open source part, but also sort of opening up the community. And if you don't take risks, you sort of stay with the same community that we have. And that's a part of what we'd like to see evolve as well. How can that be more diverse and more grown? And that means new people, which means you haven't heard them before. So sometimes we just have to. To take those chances and they pay off.
Jim VallandinghamThey pay off, like Nadieh's talk. I hadn't heard Nadieh talk before, and she destroyed it.
Irene RosSurvey says.
Jim VallandinghamSame for lots of folks.
Moritz StefanerI mean, the talks are all online already, so, dear listeners, you can check them out. You can just browse through that amazing text visualization tool and pick the topics you like and jump somewhere. I would assume all of them are pretty much great. Do you have any favorites, like, from this year? Is there, spontaneously, one talk you want to highlight? But it's difficult. I know.
Irene RosWe can't. We can't love them all.
Jim VallandinghamLove them all. They all provide different. Kyle McDonald blew everybody away with machine learning. Nadieh's. You can't miss her slides and you can't miss her talk. Marie goes. Story flow was amazing. Basically, everybody, you can't miss. Go watch them. All right now it's Friday.
Friday: Binge Watch the Weekends AI generated chapter summary:
Turn off Slack. Watch the movies. Binge watch Openviz. You can learn so much about data visualization with a couple of conferences. I mean, it's the best way to learn.
Enrico BertiniEnough task for the weekend.
Jim VallandinghamJust. Yeah, turn off. Turn off slack. Turn off your phone. Watch the movies.
Irene RosForget Netflix.
Jim VallandinghamForget Netflix. Yeah, this is better.
Enrico BertiniYeah. Binge watch Openviz. Yeah.
Jim VallandinghamAdam Pearce. Adam Pearce. We got Adam Pearce. That's amazing.
Moritz StefanerNo, but it's true. I mean, you can learn so much about data visualization with, like, a couple of conferences and then picking the right talks. I mean, it's the best way to learn.
Jim VallandinghamYeah. Yeah.
Moritz StefanerCool. Anything else? Any closing remarks, statements?
OpenViz Conf 2019: What to Do Next Year AI generated chapter summary:
Enrico: We need holograms next year. Holograms. Please submit a proposal. This is going to be our fifth year. We've already started brainstorming for fun things we can do. If we can make it a bigger event somehow, you know, come hang out with us.
Enrico BertiniI just want to know what is happening next year. I mean, I want to participate this time. I won't miss it.
Moritz StefanerHow do we not miss it next year? That's our question.
Enrico BertiniI'm going to submit something. So if I can, I would submit it now.
Irene RosWe will definitely send you guys a note when the call opens for proposals, because we'd certainly love to have you guys submit and spread the word. I don't know, but this is going to be our fifth year, which is a pretty big deal. So we've already started kind of brainstorming for fun things we can do. I don't even know yet where it's going to be, but I know that having an IMAX theater screen definitely set the bar pretty high.
Jim VallandinghamWe forgot to mention it. It was in the aquarium, the IMAX aquarium this year, which was quite impressive.
Moritz StefanerSo we need holograms next year.
Jim VallandinghamHolograms. Yeah.
Irene RosHologram visualization.
Jim VallandinghamPlease submit a talk. Everybody has a VR kit.
Irene RosGoogle cardboard for everyone.
Jim VallandinghamGoogle cardboard, everybody.
Moritz StefanerSmell of rain.
Jim VallandinghamTurn it on. Watch out. That's Moritz behind you.
Yannick AssogbaFace swapped with somebody else.
Irene RosThat's right.
Enrico BertiniYeah.
Jim VallandinghamYeah.
Irene RosSo, yeah, so we'll see. Also, this was the first year we did workshops, and they went really, really well. And so I'm sure we'll explore doing those again next year. We always wish Openviz was a longer event, but then it's hard to make just a longer conference because we also try to keep it really, really affordable. So it's really kind of at that threshold for us. And so if we can make it a bigger event somehow, you know, come hang out with us. Let's do a hack day. Let's work on projects together. You know, we're open to ideas, too, so you can always email openvizconfocu.com and go straight to my mailbox.
Moritz StefanerGood to know.
How to Attend the Conference 2019 AI generated chapter summary:
We actually have a slack channel that we started. People have coordinated things during the conference and exchanged notes. We're super open to growing that community, however our community wants it to grow. Next year, we'll be looking for your. Proposal that's now on the air.
Enrico BertiniSo, Irene, can you summarize how people can participate other than, of course, just registering and coming?
Irene RosSure. Yeah. So this was our first year. We actually have a slack channel that we started. Jim's been running that. It's been amazing to have folks hang out there. People have coordinated things during the conference and exchanged notes and things like that. So that's still happening. We also obviously release all of the data for the video visualization so the transcripts are shared, as is Peter's code. So if anybody wanted to remix that in some way, that would be great. I know there's been a lot of other visualizations of the tweets that were happening. They were really for both, just a lot of really good content there. We also, last year was the first time we tried to do kind of some collaborative note taking, and we did that again this year. I don't think it was as successful because people were just paying attention, which is great. So there are still some notes for some of the talks and they kind of aggregate links together.
Jim VallandinghamYeah, we have the transcripts. Yeah.
Irene RosAnd some things are, and we had the transcripts. I think that was definitely one of the reasons it was less used. But those are just some. I think we're super open to growing that community, however our community wants it to grow.
Enrico BertiniWell, perfect. Thanks a lot. I mean, it's fantastic what you guys are doing, both in terms of your company and the Openviz conference. I'm very much looking forward to what happens next. And as I said, I would definitely participate next year.
Moritz StefanerWe'll be looking for your proposal. That's now on the air. And that's a fact.
Enrico BertiniIt's a test.
Moritz StefanerWe'll check back in half.
Enrico BertiniYou can use this snippet against me in a few months.
Moritz StefanerSo thanks so much for coming. It's been great having you.
Yannick AssogbaThank you.
Jim VallandinghamThanks a lot.
Irene RosThanks for having us and for doing this podcast. It's great.
Enrico BertiniThank you. Bye bye.
Jim VallandinghamBye bye.
Enrico BertiniHey guys, thanks for listening to data stories again. Before you leave, we have a request if you can spend a couple of minutes rating us on iTunes, that would be extremely helpful for the show.
Data Stories AI generated chapter summary:
Before you leave, we have a request if you can spend a couple of minutes rating us on iTunes. Here's also some information on the many ways you can get news directly from us. Don't hesitate to get in touch with us. It's always a great thing for us.
Moritz StefanerAnd here's also some information on the many ways you can get news directly from us. We're, of course, on Twitter at twitter.com/datastories. We have a Facebook page, Data Stories, all in one word, and we also have an email newsletter. So if you want to get news directly into your inbox and be notified whenever we publish an episode, you can go to our homepage, datastori.es, and look for the link that you find at the bottom in the footer.
Enrico BertiniSo one last thing that we want to tell you is that we love to get in touch with our listeners, especially if you want to suggest a way to improve the show or amazing people you want us to invite or even projects you want us to talk about.
Moritz StefanerYeah, absolutely. So don't hesitate to get in touch with us. It's always a great thing for us. And that's all for now. See you next time, and thanks for listening to data stories.
CartoDB AI generated chapter summary:
This episode is sponsored by CartoDB. CartoDB is an open, powerful and intuitive platform for discovering and predicting the key facts underlying the massive location data. With CartoDB, analyzing and designing beautifully insightful maps has never been easier.
Enrico BertiniThis episode is sponsored by CartoDB. CartoDB is an open, powerful and intuitive platform for discovering and predicting the key facts underlying the massive location data in our world. With CartoDB, analyzing and designing beautifully insightful maps has never been easier. Check out incredible location intelligence projects and get started for free at CartoDB.com/gallery. That's CartoDB.com/gallery.