Sense & Scale

A site to explore cultures, cities, and computing at varying senses and scales. Updated by Ar Ducao, with content from classes at NYU, MIT, CUNY and more.

Contact: see syllabi

  • Announcements http://www.datasociety.net/ Agenda 6:30-7:00 Class updates 7:00-7:30 Bex Hurwitz, Founder, Research Action Design 7:30-8:00 Timeline.js Exercise 8:00-8:10 Break 8:10-9:20 Finish Timelines, add Personas Assignment for Nov 5 No Class next week! Expand on your timeline exercise and post to your blog OR upload and discuss the persona design images that your classmate…

    Announcements

    Agenda

    6:30-7:00 Class updates
    7:00-7:30 Bex Hurwitz, Founder, Research Action Design
    7:30-8:00 Timeline.js Exercise
    8:00-8:10 Break
    8:10-9:20 Finish Timelines, add Personas

    Assignment for Nov 5

    No Class next week!

    • Expand on your timeline exercise and post to your blog
      OR upload and discuss the persona design images that your classmate designed for your midterm project.
    • (optional) Make any final modifications to your midterm. If you do this, let me know when you’ve finished so I can reissue your progress report. I will lock down midterm grades on Nov 5. If your grade is low, I will be in touch before then.
  • Data is in this TimelineJS template.

  • Presentation Order 6:30-6:35 Announcements 6:35-6:45 Sandra Song 6:48-6:58 Daniel Silber-Baker 7:01-7:11 Neill Chua 7:14-7:24 Sweta Mohapatra 7:27-7:37 Sam Tarakaijan 7:40-7:50 Hovsep Agop 7:50-8:10 BREAK. Please fill out this class evaluation. 8:00-8:10 Helen Carey / Sriya Sarkar 8:13-8:23 Gabe Weintraub 8:26-8:36 Marijke Jorritsma 8:39-8:49 Patrick Moraitis 9:02-9:12 Anneka Goss 9:12-9:20 Final Notes Announcements http://monoskop.org/Bauhaus Next Week’s Assignments Post all your…

    Presentation Order

    6:30-6:35 Announcements
    6:35-6:45 Sandra Song
    6:48-6:58 Daniel Silber-Baker
    7:01-7:11 Neill Chua
    7:14-7:24 Sweta Mohapatra
    7:27-7:37 Sam Tarakaijan
    7:40-7:50 Hovsep Agop
    7:50-8:10 BREAK. Please fill out this class evaluation.
    8:00-8:10 Helen Carey / Sriya Sarkar
    8:13-8:23 Gabe Weintraub
    8:26-8:36 Marijke Jorritsma
    8:39-8:49 Patrick Moraitis
    9:02-9:12 Anneka Goss
    9:12-9:20 Final Notes

    Announcements

    Next Week’s Assignments

    • Post all your midterm materials:
      • visualization
      • write-up
      • slides
      • link to data source
    • Ensure that your 5 blog posts (not including the midterm) are up.

  • Announcements and Links Catherine D’Ignazio presentation Nat Geo: Data Visualization from Eye Candy to Efficiency NYU Libraries Data Services MIT Sloan Sports Analytics Conference NYC Media Lab Combine (incubator) Verizon Connected Futures Prototyping Challenge NYU Stern Conference on Big Data and Mobile Marketing New School Platform Coopertivism Internet and Democracy Conference…

    Announcements and Links

    Agenda

    • 6:30-7:30: Ekene Ijeoma lecture / Q&A
    • 7:30-8:00: Class updates
    • 8:00-8:10: Break
    • 8:10-8:55(ish): Kevin Miklasz lecture entitled “Function vs. aesthetics in data visualization: some case studies”
    • 8:55-9:20: Work time and feedback from guests

    Notes

    Next Week’s Assignments

    • Midterm, due next week. A full project brief is here. If it will help you prepare, I can write a walk-through of one of my previous visualization projects (which includes a write-up) that is about the same scope as a midterm or final, depending on your prior technical experience. You can see the project files for this vis, called “Networks in Equity and Sustainability,” at Github.
    • Make sure your 5 blog posts (not including the midterm) are up by next week. If you owe extra credit for one or more missed classes, be sure to expand one of your assignments accordingly. Let me know if you have any questions about this.
  • Announcements and Links Data Visualization from Eye Candy to Efficiency Agenda 6:30-7:15: Updates (including visualization assignments) 7:15-8:15: Catherine D’Ignazio lecture / Q&A 8:15-8:25: Break 8:25-9:15: Remote sensing (satellite data) lecture / Q&A. Guest co-speaker: Ilias Koen. 9:15-9:20: Final notes and comments Notes Remote sensing: Spectral, Temporal, Spatial Working with Landsat Data (2006)…

    Announcements and Links

    Agenda

    • 6:30-7:15: Updates (including visualization assignments)
    • 7:15-8:15: Catherine D’Ignazio lecture / Q&A
    • 8:15-8:25: Break
    • 8:25-9:15: Remote sensing (satellite data) lecture / Q&A.
      Guest co-speaker: Ilias Koen.
    • 9:15-9:20: Final notes and comments

    Notes

     

    Next Week’s Assignments

    • Blog post (due next week): Complete another brief visualization exercise. It can, but doesn’t have to, be related to your midterm project. It can follow one of the tutorials on this site, or it can follow an external tutorial. You can even dive deeper into one of the remote sensing-related tools like Google Earth Engine or GDAL. It’s up to you!
    • Midterm (due in two weeks). A full project brief is here. If it will help you prepare, I can write a walk-through of one of my previous visualization projects (which includes a write-up) that is about the same scope as a midterm or final, depending on your prior technical experience. You can see the project files for this vis, called “Networks in Equity and Sustainability,” at Github.
  • Midterm, due October 15 Note: Keywords are highlighted in orange.  Develop an interactive visualization(s) of a large (5000+ records) dataset for a general audience with no previous knowledge of the data. If the visualization is online, it should be deployed on a standalone web page—do not use code playgrounds like…

    Midterm, due October 15

    Note: Keywords are highlighted in orange

    1. Develop an interactive visualization(s) of a large (5000+ records) dataset for a general audience with no previous knowledge of the data. If the visualization is online, it should be deployed on a standalone web page—do not use code playgrounds like JSFiddle or Codepen. If the visualization is not online, please send me the files or materials I need to run and evaluate the visualization offsite. The visualization should employ
      • keys, colors, annotations, and other relevant information to show and explain patterns in the data that can only be understood through your project.
      • introductory text and titles so that a general user has all the information she/he needs to engage with and understand the visualization without further explanation.
    2. Include a background write-up (at least 800 words on your blog is fine) with an MLA-formatted bibliography containing at least 3 scholarly articles. Be sure to address the following points in the write-up. I will evaluate your project based on these points:
      • How is your overall visualization significant, unique, and relevant to general audiences?
      • What is the data source? How is it significant?
      • What questions does your visualization help answer? What is the best medium for this representation? (e.g. poster, model, web page)
      • Please discuss prior work, prototypes, or sketches for your visualization. These can be others’ work (please cite it in your bibliography), your own work, or both. How have these predecessors refined your work for this project?
      • Is there a subset of the general audience that could especially use or appreciate your visualization? If so, why?
      • How is your visualization implemented? What technologies does it use?
      • How is your visualization meant to be used? What are the steps for a user to interact with your vis? 
      • Future Work: What are the next steps for refining this project?
    3. Create a slide deck (of at least 6 slides) that explains the content in your write-up.
    4. Be prepared to give a 7-10 minute presentation of your slides and your visualization, followed by 5-7 minutes of Q&A.
  • Announcements and Links CUNY Interactivity Conference (via Marijke): http://interactivity.splashthat.com/ Use the code “JSTUDENT” to get 30% off the ticket. New books Tutorial section NYC Data Science From Catherine Cramer: Presentation deck Teaching Network Literacy NetSci High Design Make Play book (and articles that cite it) Agenda 6:30-7:15: Updates (including visualization assignments) 7:15-8:00:…

    Announcements and Links

    Agenda

    • 6:30-7:15: Updates (including visualization assignments)
    • 7:15-8:00: Chris Willard lecture / Q&A
    • 8:00-8:10: Break
    • 8:10-8:45: Workshop
    • 8:45-9:15: Workshop Results and Updates
    • 9:15-9:20: Final notes and comments

    Notes

    • Visualization Criteria
      • What is an interesting data source?
      • Who is the audience you want to see this data?
      • What type of representation would serve this audience best (record, summary, process)?
      • What questions does this help answer?
      • What is the best medium for this representation? (poster, model, web page)
      • Sketch your vision for how it will work (harkening back to examples).
      • How will you implement your visualization (incl. finding starting points/examples)?

    Next Week’s Assignments

    • Blog Post 1: Finish NYSCI exercise documentation. Please include your Ideation sketches/pictures from the “big data for little kids” exercise, a short bibliography (using MLA citation format) of your scholarly resources, and reflections on the activity.
    • Blog Post 2: Post your results from today to start your midterm preparations. In addition to posting your sketches and preliminary answers to the questions above, please also post your progress on
      • team formation
      • data choice and acquisition
      • technical requirements.
        I also encourage you to post another practice visualization if you have time; I will post two tutorials this week on bubble charts and force-directed bubbles.
  • Full tutorial coming soon! In the meantime, check the JSFiddle below. This visualization shows the top 2011 NYC baby names by mother’s ethnicity. The vis is a variation on our static bubble chart, but with some fun multi-foci force direction added in. Background from the D3 API reference and more: quick…

    Full tutorial coming soon! In the meantime, check the JSFiddle below.

    This visualization shows the top 2011 NYC baby names by mother’s ethnicity. The vis is a variation on our static bubble chart, but with some fun multi-foci force direction added in.

    Background from the D3 API reference and more:

    This tutorial is based on our previous bubble chart tutorial and M. Bostock’s multi-foci force layout, as well as our bar chart tutorial, Bostock’s bubble chart example, and InfoCaptor’s Bubble My Page service.

  • Full tutorial coming soon! In the meantime, check the JSFiddle below. This visualization shows the top 2011 NYC baby names by mother’s ethnicity. Background from the D3 API reference and more: bubble chart as a flattened pack layout the pack layout is one of D3’s hierarchy layouts This tutorial is based…

    Full tutorial coming soon! In the meantime, check the JSFiddle below.

    This visualization shows the top 2011 NYC baby names by mother’s ethnicity.

    Background from the D3 API reference and more:

    • bubble chart as a flattened pack layout
    • the pack layout is one of D3’s hierarchy layouts

    This tutorial is based on our previous bar chart tutorial, Bostock’s bubble chart example, and InfoCaptor’s Bubble My Page service.

    CSS

    We set up a text size and style for our vis.

    text {
     font: 10px sans-serif;
    }

    HTML

    We include D3.js for the graphics, and jQuery for the rollovers.

    <script src="https://cdnjs.cloudflare.com/ajax/libs/d3/3.5.5/d3.min.js"></script>
    <script src="http://code.jquery.com/jquery-latest.min.js"></script>

    Javascript

    First we set up the global variables for our visualization size, type, colors, and bubbles.

    var diameter = 700,
     format = d3.format(",d"),
     color = d3.scale.category20c();
    
    var bubble = d3.layout.pack()
     .sort(null)
     .size([diameter, diameter])
     .padding(1.5);
    
    var svg = d3.select("body").append("svg")
     .attr("width", diameter)
     .attr("height", diameter)
     .attr("class", "bubble");

    We then query NYC Open Data for the baby name dataset, and throw an error if this query is not successful.

    d3.csv("https://data.cityofnewyork.us/resource/25th-nujf.csv?$limit=10000", function (error, root) {
     if (error) throw error;

    Within the d3.csv function, we set up sub-arrays, divided by ethnicity, to record baby names and related attributes for each ethnicity type (which are pre-determined in the data set). We also set up arrays called data and dobj in which we will concatenate all the sub-arrays.

    //baby names
     var AsianNameList=[];
     var BlackNameList=[];
     var HispanicNameList=[];
     var WhiteNameList=[]; 
     
     //number of babies with name
     var AsianNameCount=[];
     var BlackNameCount=[];
     var HispanicNameCount=[];
     var WhiteNameCount=[]; 
     
     //ethnicity of mother
     var AsianEthnicity=[]; 
     var BlackEthnicity=[]; 
     var HispanicEthnicity=[]; 
     var WhiteEthnicity=[]; 
     
     var data=[]; //array of three parallel arrays: nameList, nameCount, ethnicity
     var dobj=[]; //array formatted specifically for hierarchical processing

    We now examine all the names in the dataset and bin them into the sub-arrays. There are duplicate records in the dataset, so we only add the name if it’s not already in the array. We use the “indexOf” function to check if the name is already in the array.

    root.forEach(function (d) {
     if (+d["RNK"] <= 10) {
     
     //only add name if it's not already in the array
     if ((d["ETHCTY"]==="ASIAN AND PACIFIC ISLANDER") && (AsianNameList.indexOf(d["NM"]) ===-1)) {
     AsianNameList.push(d["NM"]);
     AsianNameCount.push(+d["CNT"]); //force the string into an integer
     AsianEthnicity.push(d["ETHCTY"]);
     }
     else if ((d["ETHCTY"]==="BLACK NON HISPANIC") && (BlackNameList.indexOf(d["NM"]) ===-1)) {
     BlackNameList.push(d["NM"]);
     BlackNameCount.push(+d["CNT"]); //force the string into an integer
     BlackEthnicity.push(d["ETHCTY"]);
     }
     else if ((d["ETHCTY"]==="HISPANIC") && (HispanicNameList.indexOf(d["NM"]) ===-1)) {
     HispanicNameList.push(d["NM"]);
     HispanicNameCount.push(+d["CNT"]); //force the string into an integer
     HispanicEthnicity.push(d["ETHCTY"]);
     }
     else if ((d["ETHCTY"]==="WHITE NON HISPANIC") && (WhiteNameList.indexOf(d["NM"]) ===-1)) {
     WhiteNameList.push(d["NM"]);
     WhiteNameCount.push(+d["CNT"]); //force the string into an integer
     WhiteEthnicity.push(d["ETHCTY"]);
     }
     }
     });
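
    The duplicate check can be tried on its own, without D3 or the live dataset. The sketch below is only an illustration: the rows are made-up sample records (not real counts from NYC Open Data), and addIfNew is a hypothetical helper name that does not appear in the tutorial code.

```javascript
// Made-up sample rows; the real dataset has many more, including duplicates.
var rows = [
  { NM: "SOPHIA", CNT: "120", ETHCTY: "HISPANIC", RNK: "1" },
  { NM: "SOPHIA", CNT: "120", ETHCTY: "HISPANIC", RNK: "1" }, // duplicate record
  { NM: "ISABELLA", CNT: "95", ETHCTY: "HISPANIC", RNK: "2" },
  { NM: "MIA", CNT: "40", ETHCTY: "HISPANIC", RNK: "11" } // outside the top 10
];

var nameList = [];
var nameCount = [];

// Hypothetical helper: push a row only if its name is not in the list yet.
function addIfNew(row) {
  if (nameList.indexOf(row.NM) === -1) { // indexOf returns -1 when not found
    nameList.push(row.NM);
    nameCount.push(+row.CNT); // unary plus converts the string to a number
  }
}

rows.forEach(function (row) {
  if (+row.RNK <= 10) addIfNew(row); // keep only the top 10 ranks
});

console.log(nameList, nameCount);
```

    Because the name and its count are pushed together, the two arrays stay the same length and share the same index for each name.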

    We then concatenate all this data into one array called data. There is some simple test data currently commented out; this can be helpful if you are having trouble managing your larger dataset.

    data=[
     //test data is commented out:
     //["Tea","Coffee","Soda","Chips","Milk","Chocolate","Beer","Wine"],
     //[130,30,200,40,230,150,80,65]
     AsianNameList.concat(BlackNameList,WhiteNameList,HispanicNameList), 
     AsianNameCount.concat(BlackNameCount,WhiteNameCount,HispanicNameCount), 
     AsianEthnicity.concat(BlackEthnicity,WhiteEthnicity, HispanicEthnicity)
     ];

    We prepare an array called dobj to record the index of each name. We also run the function called display_pack.

    for (var di=0;di<data[0].length;di++) { 
       dobj.push({"key":di,"value":data[1][di]}); 
    } 
    
    display_pack({children: dobj});
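
    To see why dobj only needs an index and a value, here is a small sketch using made-up test values in the spirit of the commented-out test data. The third array ("hot"/"cold") is a hypothetical category standing in for ethnicity.

```javascript
// Made-up test values: three parallel arrays that share one index.
var data = [
  ["Tea", "Coffee", "Soda"], // data[0]: labels
  [130, 30, 200],            // data[1]: values that size the bubbles
  ["hot", "hot", "cold"]     // data[2]: category used for color (hypothetical)
];

var dobj = [];
for (var di = 0; di < data[0].length; di++) {
  // "key" is the shared index; "value" is the number the bubbles are sized by
  dobj.push({ key: di, value: data[1][di] });
}

// Any leaf can look up its label and category through its key.
var leaf = dobj[2];
console.log(data[0][leaf.key], data[2][leaf.key], leaf.value); // prints: Soda cold 200
```

    This is the same lookup that the color, tooltip, and label code performs with data[0][d.key], data[1][d.key], and data[2][d.key].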

    The function display_pack consists of three parts. First, we set up SVG nodes (bubbles): here we set up bubble position, color, and mouseover text.

    function display_pack(root) {
     var node = svg.selectAll(".node")
     .data(bubble.nodes(root)
     .filter(function(d) { return !d.children; }))
     .enter().append("g")
     .attr("class", "node")
     .attr("transform", function(d) { return "translate(" + d.x + "," + d.y + ")"; })
     .style("fill", function(d) { 
     //color is based on ethnicity
     return color(data[2][d.key]); })
     .on("mouseover", function(d) {
     d3.select(this).style("fill", "gold"); 
     showToolTip(" "+data[0][d.key]+"<br> "+data[2][d.key]+"<br>count: "+data[1][d.key]+" ",d.x+d3.mouse(this)[0]+50,d.y+d3.mouse(this)[1],true);
     //console.log(d3.mouse(this));
     })
     //.on("mousemove", function(d,i) {
     //tooltipDivID.css({top:d.y+d3.mouse(this)[1],left:d.x+d3.mouse(this)[0]+50});
     //}) 
     .on("mouseout", function() {
     d3.select(this).style("fill", function(d) { return color(data[2][d.key]); });
     showToolTip(" ",0,0,false);
     });
    
    

    Then we size our bubbles.

    node.append("circle")
     .attr("r", function(d) { return d.r; });

    Finally, we add text to our bubbles.

    node.append("text")
     .attr("dy", ".3em")
     .style("text-anchor", "middle")
     .style("fill","black")
     .text(function(d) { return data[0][d.key].substring(0, d.r / 3); });
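
    The substring call trims each name so that it fits inside its bubble. As a rough illustration, the sketch below wraps the same expression in a hypothetical helper called fitLabel (not part of the tutorial code) and tries it with a few made-up radii.

```javascript
// Hypothetical helper mirroring the expression used in the .text() call.
function fitLabel(name, radius) {
  // Roughly one character per 3 pixels of radius; substring truncates
  // a fractional end index down to a whole number.
  return name.substring(0, radius / 3);
}

console.log(fitLabel("ALEXANDER", 40)); // room for 13 characters, so the full name
console.log(fitLabel("ALEXANDER", 12)); // 4 characters: "ALEX"
console.log(fitLabel("ALEXANDER", 2));  // less than 1 character: ""
```

    Very small bubbles therefore show no label at all, which is why the mouseover tooltip carries the full name.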

    The function display_pack calls showToolTip, so the last step is to write showToolTip. It creates the tooltip div the first time it is called, then shows or hides it as needed. And that’s it!

    function showToolTip(pMessage,pX,pY,pShow)
     {
     if (typeof(tooltipDivID)==="undefined") {
     tooltipDivID =$('<div id="messageToolTipDiv" style="position:absolute;display:block;z-index:10000;border:2px solid black;background-color:rgba(0,0,0,0.8);margin:auto;padding:3px 5px 3px 5px;color:white;font-size:12px;font-family:arial;border-radius: 5px;vertical-align: middle;text-align: center;min-width:50px;overflow:auto;"></div>');
    
     $('body').append(tooltipDivID);
     }
     if (!pShow) { tooltipDivID.hide(); return;}
     //MT.tooltipDivID.empty().append(pMessage);
     tooltipDivID.html(pMessage);
     tooltipDivID.css({top:pY,left:pX});
     tooltipDivID.show();
     }
  • Bar charts are some of the simplest forms of visualization and can be a good place to start when making your first vis. Inspired by some of the Socrata and D3 examples, I put together a simple chart showing the daily distribution of my dataset from the SODA tutorial (311 noise complaints, in zip…

    Bar charts are some of the simplest forms of visualization and can be a good place to start when making your first vis. Inspired by some of the Socrata and D3 examples, I put together a simple chart showing the daily distribution of my dataset from the SODA tutorial (311 noise complaints, in zip code 11231, for September 2015). As you can see, most complaints occur on the weekend.

    This chart is based on Mike Bostock’s simple bar chart tutorial, with modifications made to read the relevant NYC Open Data. As with many visualizations of this type, its code is written in CSS, HTML, and Javascript. This tutorial assumes proficiency with CSS and HTML, and a basic understanding of Javascript.

    CSS

    In the style section, we set up the fills, strokes, fonts, and shape rendering styles for our bars and axes. We also specify that there will be no x-axis line.

    .bar {
     fill: orange;
    }
    
    .bar:hover {
     fill: lightpink;
    }
    
    .axis {
     font: 10px sans-serif;
    }
    
    .axis path,
    .axis line {
     fill: none;
     stroke: #000;
     shape-rendering: crispEdges;
    }
    
    .x.axis path {
     display: none;
    }

    HTML

    Our HTML body contains only two lines, which include the Javascript libraries D3.js and Moment.js. Of course, it also contains our custom JS functions, which we’ll discuss below.

    <script src="https://cdnjs.cloudflare.com/ajax/libs/d3/3.5.5/d3.min.js"></script>
    <script src="https://cdnjs.cloudflare.com/ajax/libs/moment.js/2.10.6/moment.min.js"></script>

    Javascript

    First, we declare and initialize our spatial variables: the margins, width, and height of the vis; the x and y positioning variables, the x-axis and the y-axis, and the SVG (scalable vector graphic) into which we’ll draw our vis.

    Note: I’ll add more detail later about the D3 functions used to initialize these variables, and about the D3 functions used throughout the script. In the meantime, you can start with this info, and you can also drill down into the D3 documentation from Mike Bostock’s simple bar chart tutorial.

    var margin = {top: 20, right: 20, bottom: 30, left: 40},
     width = 960 - margin.left - margin.right,
     height = 500 - margin.top - margin.bottom;
    
    var x = d3.scale.ordinal()
     .rangeRoundBands([0, width], .1);
    
    var y = d3.scale.linear()
     .range([height, 0]);
    
    var xAxis = d3.svg.axis()
     .scale(x)
     .orient("bottom");
    
    var yAxis = d3.svg.axis()
     .scale(y)
     .orient("left")
     .ticks(10);
    
    var svg = d3.select("body").append("svg")
     .attr("width", width + margin.left + margin.right)
     .attr("height", height + margin.top + margin.bottom)
     .append("g")
     .attr("transform", "translate(" + margin.left + "," + margin.top + ")");

    Our remaining bit of code inputs the NYC Open Data csv, parses it, processes it, and visualizes it.

    Input it

    Now we’ll call D3’s CSV function to input our NYC data. Notice that my query from Tutorial 1 has an additional filter: “&$select=created_date.” I added this filter to output ONLY the created_date for each record, because this is the only field that concerns me for this visualization. If there is an issue with the data being read, the “if (error) throw error” line will halt the script and report the error.

    d3.csv("https://data.cityofnewyork.us/resource/erm2-nwe9.csv?$where=starts_with(complaint_type,'Noise') AND created_date >='2015-08-01T00:00:00' AND incident_zip='11231' &$select=created_date", function (error, data) {
        if (error) throw error;
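
    If you want to modify the query for your own data, it can help to assemble the URL in pieces. The sketch below rebuilds the same endpoint and filters used above; the variable names (base, where, select, url) are my own, and escaping the filter text with encodeURIComponent is an optional precaution rather than something the original query does.

```javascript
// Same endpoint and filters as the query in the tutorial, assembled in pieces.
var base = "https://data.cityofnewyork.us/resource/erm2-nwe9.csv";

var where = [
  "starts_with(complaint_type,'Noise')",
  "created_date >='2015-08-01T00:00:00'",
  "incident_zip='11231'"
].join(" AND ");

var select = "created_date"; // return only this column

// encodeURIComponent escapes spaces and symbols such as > and =
var url = base +
  "?$where=" + encodeURIComponent(where) +
  "&$select=" + encodeURIComponent(select);

console.log(url);
```

    Keeping each condition on its own line makes it easier to swap in a different zip code, date, or complaint type.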

    Parse and Process it

    Now we’ll set up an array “week” to represent Sunday (represented as week[0]) through Saturday (represented as week[6]). We will iterate through our NYC dataset with a “forEach” loop, and use the array to count the occurrence of noise complaints for each day.

    This section of code is where we parse each date using Moment.js. Because NYC Open Data is NOT outputting dates in ISO 8601 format, we give Moment an explicit format string (“M/DD/YYYY hh:mm:ss a”) to parse the date. We also use Moment’s “day()” function to interpret that date as a day of the week (0 for Sunday through 6 for Saturday).

    var week = [0, 0, 0, 0, 0, 0, 0]; 
    
    data.forEach(function (d) {
    
     //parse the date once, then read its day of the week (0 = Sunday)
     var day = moment(d["Created Date"], "M/DD/YYYY hh:mm:ss a").day();
     //skip any record whose date could not be parsed
     if (day >= 0 && day <= 6) {
     week[day]++;
     }
     });
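
    The counting step can also be tried on its own, without Moment or the live data. This sketch swaps in the built-in Date object to stay dependency-free, so it is not the method used above. The timestamps are made-up samples, their format is an assumption based on the format string above, and dayOfWeek is a hypothetical helper.

```javascript
// Made-up sample timestamps; the format is an assumption.
var samples = [
  "09/05/2015 11:30:00 PM", // a Saturday
  "09/06/2015 01:15:00 AM", // a Sunday
  "09/06/2015 10:00:00 PM", // a Sunday
  "09/07/2015 08:45:00 AM"  // a Monday
];

// Hypothetical helper: pull month/day/year out of the string and ask
// the built-in Date object for the day of the week (0 = Sunday).
function dayOfWeek(text) {
  var parts = text.split(" ")[0].split("/"); // ["09", "05", "2015"]
  var date = new Date(+parts[2], +parts[0] - 1, +parts[1]); // months are 0-based
  return date.getDay();
}

var week = [0, 0, 0, 0, 0, 0, 0];
samples.forEach(function (text) {
  week[dayOfWeek(text)]++; // the day number doubles as the array index
});

console.log(week); // prints: [ 2, 1, 0, 0, 0, 0, 1 ]
```

    Since the day number runs from 0 to 6, it can be used directly as the index into week.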

    I threw in a few debug statements to check the contents of the final array, as well as check the day with the largest number of complaints. If you’re modifying this tutorial with your own data, these debug statements may be useful to you.

     console.log("week:");
     for (var index = 0; index < week.length; index++) {
     console.log(week[index]);
     }
     console.log("most complaints in a day:", Math.max.apply(Math, week));

    Note: If you’re using Chrome and running the code for this bar chart as a browser page, you can see the output for these statements in View > Developer > Javascript Console. Firefox and Safari have similar Javascript consoles.
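
    If you also want to know which day had the most complaints, a small extension of the debug statements can look it up. This is a sketch with made-up counts; the dayNames array and the busiest variable are my own additions.

```javascript
var week = [12, 3, 4, 2, 5, 9, 15]; // made-up counts, Sunday first
var dayNames = ["Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"];

// apply() spreads the array into separate arguments: Math.max(12, 3, 4, ...)
var most = Math.max.apply(Math, week);
var busiest = dayNames[week.indexOf(most)]; // first day with that count

console.log("most complaints in a day:", most, "on", busiest);
```

    If two days tie for the maximum, indexOf returns the earlier one.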

    Visualize it!

    Now it’s time to visualize the array. First, we’ll set up our domains, or the array of values that our axes must span. Domain values will be displayed as labels on our axes.

    • For the y-axis, the domain is from 0 to the maximum number of complaints aggregated for any day of the week.
    • For the x-axis, the domain is the days of the week.

    y.domain([0, Math.max.apply(Math, week)]);
    x.domain(["Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"]);

    Next we will draw the x-axis.

     svg.append("g")
     .attr("class", "x axis")
     .attr("transform", "translate(0," + height + ")")
     .call(xAxis);

    Now we’ll draw the y-axis. We will label this axis “Number of Complaints.”

      svg.append("g")
     .attr("class", "y axis")
     .call(yAxis)
     .append("text")
     .attr("transform", "rotate(-90)")
     .attr("y", 6)
     .attr("dy", ".71em")
     .style("text-anchor", "end")
     .text("Number of Complaints");

    Finally, we will draw a vertical bar for each day of the week. We iterate through the array week for this process.

    • To determine the value of x, we use the index of week to look up the corresponding value of x.domain (which will be Sunday or another day of the week). The scale x converts that day into a horizontal position, and the width of each bar is the scale’s “rangeBand,” which we set up early on in our variable declarations.
    • To determine the value of y, we simply take the value of week’s element. The top of each bar sits at y(d), and its height is the total height of the vis minus y(d).

    svg.selectAll(".bar")
     .data(week)
     .enter().append("rect")
     .attr("class", "bar")
     .attr("x", function (d, i) {return x(x.domain()[i]);})
     .attr("width", x.rangeBand())
     .attr("y", y)
     .attr("height", function (d) {return height - y(d);});
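
    The y arithmetic can be checked by hand. The sketch below uses a simplified stand-in for the linear scale (it is not D3’s implementation) with made-up counts, and shows how a bar’s top and height follow from the inverted range [height, 0].

```javascript
var height = 450; // 500 minus the top and bottom margins from the tutorial
var week = [10, 2, 3, 1, 4, 6, 20]; // made-up counts, Sunday first
var max = Math.max.apply(Math, week);

// Simplified stand-in for a linear scale with domain [0, max] and range [height, 0]
function y(value) {
  return height - (value / max) * height;
}

// Each bar starts at y(value) and extends down to the x-axis at `height`.
var bars = week.map(function (value) {
  return { top: y(value), height: height - y(value) };
});

console.log(bars[6]); // tallest bar: top 0, height 450
console.log(bars[0]); // half the maximum: top 225, height 225
```

    SVG coordinates grow downward, which is why the range runs from height to 0: larger counts map to smaller y values, so taller bars start closer to the top.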

    Checking and Troubleshooting

    You can fork and fiddle this vis using JSFiddle. Because I will require a standalone file from you for the midterm, it might be helpful to see the vis as a standalone HTML file. You can copy the file to your own server and modify the code with your own data. From there, you can debug and see console statements using Chrome > View > Developer > Javascript Console (if you’re using Chrome).
