import React from "react";
// import "./sources.css";

export class Sources extends React.Component {
	render() {
		return (
			<div className={"grid"}>
				<div className = "topnav">
				<a href="#top"><img className = "banner-logo" src="/maplogo.png" alt="Technology Policy Institute Logo"/></a>
					{/* <button id="pdf"> Export PDF</button> */}
					<a id = "link" href="/#scatter">Data Correlation</a>
					<a id = "link" href="/#top">Map and Time Series</a>
					<a id = "link" href="/sources">Dataset Descriptions and Discussion</a>
					<a id = "link" href="https://tpibroadband.tribeplatform.com/">User Forum</a>
				</div>
				<div className = "source-content">
					<h1>Datasets Currently Available in TPI Broadband Map</h1>
					<p><a href="#availability">Availability</a></p>
					<p style={{marginLeft: "40px"}}><a href="#form477">Form 477 Data</a></p>
					<p><a href="#adoption">Adoption</a></p>
					<p style={{marginLeft: "40px"}}><a href="#acs">American Community Survey</a></p>
					<p style={{marginLeft: "40px"}}><a href="#speedtest">Speed Tests</a></p>
					<p style={{marginLeft: "80px"}}><a href="#ookla">Ookla</a></p>
					<p style={{marginLeft: "80px"}}><a href="#microsoft">Microsoft Speed Threshold</a></p>
					<p><a href="#digitaldivide">Data from Initiatives to Address the Digital Divide</a></p>
					<p style={{marginLeft: "40px"}}><a href="#rdof">Rural Digital Opportunity Fund (RDOF)</a></p>
					<p style={{marginLeft: "40px"}}><a href="#ebb">Emergency Broadband Benefit (EBB) Program</a></p>
					<h2 id="availability">Availability</h2>
					<p>Availability refers to where broadband is available, but not the extent to which people subscribe to it.</p>
					<h3 id="form477">Form 477 Data</h3>
					<p>ISPs report their coverage area to the FCC on “<a href="https://www.fcc.gov/economics-analytics/industry-analysis-division/form-477-resources">Form 477</a>,” which gives the dataset its name. All facilities-based fixed and terrestrial mobile ISPs <a href="https://us-fcc.app.box.com/v/WhoMustFileForm477">must file</a> broadband data twice per year through this form. The public version of the data contained in the TPI map <a href="https://www.fcc.gov/general/explanation-broadband-deployment-data">includes</a>:</p>
					<ul>
						<li>15-digit Census Block</li>
						<li>Type of technology used to offer service</li>
						<li>Maximum advertised download speed in Mbps</li>
						<li>Maximum advertised upload speed in Mbps</li>
					</ul>
					<p>The FCC began <a href="https://docs.fcc.gov/public/attachments/FCC-19-79A1.pdf">collecting data on Form 477</a> in 2000. Providers must include information on data connections offering advertised speeds of 200 kbps or higher to “end user premises.” Note this requirement does not affect any definition of broadband or imply that the FCC considers 200 kbps to be “broadband,” “high speed internet,” or any qualitative description of the data transfer. Instead, the data make it possible to calculate which Census Blocks have access to data connections at any given definition.</p>
					<h4>Pros</h4>
					<ul>
						<li>Data can be compared over time to understand trends in availability, because the FCC has used a similar approach to collecting the information since the collection began in 2000.</li>
						<li>The data are available for relatively small geographic areas (Census Blocks). With more than 11 million Census Blocks, the data provide a fairly granular picture of availability. For most policy analysis this level of granularity is sufficient or even more disaggregated than necessary, particularly given the other datasets needed to add value to this data.</li>
					</ul>
					<h4>Cons</h4>
					<ul>
						<li>It overstates coverage. The problem arises because the FCC considers a Census Block “covered” if at least one provider offers service to at least one household (or business) anywhere in the Census Block.</li>
						<li>It can’t identify specific geographic areas without coverage. The data are useful as a starting point for identifying areas to examine more closely. However, the use of pre-defined geographic regions (important as they are for statistical analysis), combined with the way the FCC defines “covered,” means it is not possible to draw the boundaries of areas that are not covered.</li>
					</ul>
					<h4>Good to Know</h4>
					<p>The <a href="https://www.commerce.senate.gov/2020/3/bill-to-improve-broadband-data-maps-signed-into-law">Broadband Deployment Accuracy and Technological Availability</a> (DATA) Act, passed in March 2020, is intended to address both of the problems listed above. The FCC’s <a href="https://www.fcc.gov/BroadbandData/bdtf">Broadband Data Task Force</a> is leading the effort to collect far more granular data on availability. TPI’s map will include this new data as soon as it is available.</p>
					<p>The public data from Form 477 provides a mapping of broadband availability, not adoption. The FCC also collects <a href="https://us-fcc.app.box.com/v/ChangesFor2019and2020">data on the number of subscribers</a> (adoption) by Census Tract, but that information is not publicly available as raw data. Instead, data on adoption comes from household surveys conducted by the U.S. Census.</p>
					<p>The FCC generates several reports based on Form 477 data:</p>
					<ul>
						<li><a href="https://www.fcc.gov/reports-research/reports/broadband-progress-reports">Broadband Progress Reports</a></li>
						<li><a href="https://broadbandmap.fcc.gov/#/">Standalone map of most recent Form 477 fixed broadband</a> connections</li>
					</ul>
					<p><b>Data Source</b>: <a href="https://www.fcc.gov/general/broadband-deployment-data-fcc-form-477">https://www.fcc.gov/general/broadband-deployment-data-fcc-form-477</a></p>
					<h2 id="adoption">Adoption</h2>
					<p>Adoption refers to who is connected to the internet. As such, it reflects an intersection of supply and demand.</p>
					<h3 id="acs">American Community Survey</h3>
					<p>Data on adoption comes from the <b>U.S. Census <a href="https://www.census.gov/programs-surveys/acs">American Community Survey</a> (ACS)</b>. It is released annually and includes information on whether a household subscribes to an internet connection and, if so, the type of connection.</p>
					<p>The Census Bureau collects data at the household level, but for confidentiality reasons, releases it only in aggregated form. It is available by state, county, congressional district, census tract, zip code, combined statistical area, metropolitan statistical area, and incorporated/census designated place. </p>
					<p>Data on broadband adoption derive from <a href="https://www2.census.gov/programs-surveys/acs/methodology/questionnaires/2019/quest19.pdf">two key questions on the ACS</a>:</p>
					<ul>
						<li>At this house, apartment, or mobile home - do you or any member of this household have access to the Internet?</li>
						<li>Do you or any member of this household have access to the Internet using a [specific type of service, such as a cellular data plan, broadband, satellite, or dial-up]?</li>
					</ul>
					<img src="/acs.png" alt="American Community Survey Questionnaire" />
					<h4>Pros</h4>
					<p>The ACS also includes detailed demographic data for surveyed households. This demographic data allows a small number of questions on broadband to yield considerable information. For each question about broadband, we can, in principle, see breakdowns by age, income, number of children, education, and more. </p>
					<h4>Cons</h4>
					<p>Not all of the broadband questions are available for all types of demographics. For example, we can calculate across all types of households at any geographic level the share of households with broadband access and what type of technology they use: fiber, cable, DSL, and mobile. But it is possible to differentiate demographics based only on whether the household has internet, not based on what kind of internet it has. Given that by 2018 <a href="https://www.census.gov/content/dam/Census/library/publications/2021/acs/acs-49.pdf">only about 0.2%</a> of households had only a dialup connection, we can reasonably assume the variable measuring “internet” access is almost entirely broadband.</p>
					<p>For confidentiality reasons, data is available to the public down to the Census Tract level, not the Census Block level. One implication is that combining it with the Form 477 data requires aggregating the Form 477 data, which is available by Census Block, up to a comparable geographic level such as Census Tract or county.</p>
					<h4>Good to Know</h4>
					<p>The ACS data for broadband come in two flavors: <a href="https://www.census.gov/programs-surveys/acs/guidance/estimates.html">1-year and 5-year</a>. The 1-year dataset includes data collected over the previous 12 months. The 5-year dataset is an average of the previous 60 months. The 1-year data is the better source when looking for year-to-year changes. However, because of the smaller sample size, the 1-year data is released only for areas with populations greater than 65,000 and cannot include all demographics. The 5-year dataset has sufficient observations to investigate all the geographic areas listed above as well as all the household demographic information collected.</p>
					<p><b>Data Source</b>: <a href="https://www.census.gov/acs/www/data/data-tables-and-tools/subject-tables/">https://www.census.gov/acs/www/data/data-tables-and-tools/subject-tables/</a></p>
					<h3 id="speedtest">Speed Tests</h3>
					<p>Unlike the FCC’s Form 477 availability data, which shows maximum advertised speeds, or the ACS adoption data, which notes whether a household has subscribed to broadband, speed test data shows the bandwidth that consumers actually receive. That bandwidth is a function of the subscription tier a consumer purchases and of factors across the network that affect data flows to the end user. As such, like adoption, speed tests are a measure of the intersection of supply and demand.</p>
					<h3 id="ookla"><a href="https://www.ookla.com/">Ookla Open Data</a></h3>
					<p>Ookla <a href="https://www.ookla.com/about">describes itself</a> as “the global leader in mobile and broadband network intelligence, testing applications and technology. Speedtest®, Ookla's flagship network testing platform, collects hundreds of millions of measurements about the performance and quality of networks around the world each day.” Each quarter, Ookla makes available average fixed and mobile speeds at the Census Tract and larger geographic levels.</p>
					<h4>Pros</h4>
					<p>Ookla’s data is based on an enormous number of tests: as of September 3, 2021, Ookla notes that it has run nearly 38 billion tests globally. The company updates its data frequently and is <a href="https://www.speedtest.net/insights/blog/how-ookla-ensures-accurate-reliable-data-2020/">transparent about</a> its testing methodology.</p>
					<h4>Cons</h4>
					<p>The data shows results only for people who choose to run a speed test. This self-selection means that the results may not be representative. However, we do not know the direction of the bias. Estimates could be biased upwards if, for example, people who subscribe to fast speeds like to check that they are getting what they expect, or downwards if, for example, people experiencing connection problems check their speeds because the connection seems slow.</p>
					<h4>Good to Know</h4>
					<p><i>Speed tests cannot, by themselves, determine whether households receive the speeds to which they subscribe.</i> Without knowing the bandwidth a household has decided to purchase, speed test data cannot reveal the difference between what a household should expect and what is available to that household. The FCC has attempted to address this question through its <a href="https://www.fcc.gov/general/measuring-broadband-america">Measuring Broadband America efforts</a>. (TPI’s Broadband Map will soon include this data.)</p>
					<p>Speedtest by Ookla Global Fixed and Mobile Network Performance Maps was last accessed on January 8, 2023 from https://registry.opendata.aws/speedtest-global-performance. Speedtest® by Ookla® Global Fixed and Mobile Network Performance Maps. Based on analysis by Ookla of Speedtest Intelligence® data for Q1 2019-Q4 2022. Provided by Ookla and accessed January 8, 2023. Ookla trademarks used under license and reprinted with permission.</p>
					<p>The average speed subscribers receive in any given geographic region will never equal the maximum available unless everyone in the region subscribes to the fastest tier. For example, the image below shows Comcast offering three speeds in the Washington, D.C. area (as of August, 2021): 200, 400, and 1200 Mbps. When the FCC releases data reflecting today’s conditions, maximum available will be at least 1200 Mbps in this area, but speed tests will show much lower averages because many people will choose not to spend an extra $40/month for a gigabit of additional speed.</p>
					<img src="/comcast.png" alt="Comcast Plans for Washington DC" />
					<p><b>Data Source</b>: <a href="https://www.ookla.com/ookla-for-good/open-data">https://www.ookla.com/ookla-for-good/open-data</a></p>
					<h3 id="microsoft">Microsoft Speed Threshold</h3>
					<p>Microsoft makes public data on the share of households whose connections meet or exceed the FCC’s definition of “broadband” (25 Mbps downstream, 3 Mbps upstream). Microsoft <a href="https://github.com/microsoft/USBroadbandUsagePercentages">says</a> it measures this share “by combining data from multiple Microsoft services” at county and zip code geographic levels. “The data from these services are combined with the number of households per county and zip code. Every time a device receives an update or connects to a Microsoft service, we can estimate the throughput speed of a machine. We know the size of the package sent to the computer, and we know the total time of the download. We also determine zip code level location data via reverse IP. Therefore, we can count the number of devices that have connected to the internet at broadband speed per each zip code based on the FCC’s definition of broadband that is 25mbps per download.”</p>
					<h4>Pros</h4>
					<p>Microsoft’s dataset is unique and creative and shows how existing resources can be leveraged to provide more information. It <a href="https://github.com/microsoft/USBroadbandUsagePercentages/blob/master/assets/Broadband_usage_differential_privacy_paper.pdf">demonstrates</a> how “differential privacy” can be used to aggregate user data in a way that makes it useful for policy while protecting users’ privacy. It can be used to track trends.</p>
					<h4>Cons</h4>
					<p><i>The measurement tool is intended for a different purpose.</i>  Microsoft obtains the measurement from a household when Microsoft is delivering a product update or a product is contacting Microsoft for other, unspecified, reasons. The firm’s servers are presumably configured to most efficiently distribute product updates or perform whatever service is required. This measure may take into account the end user’s connection speed, but certainly takes into account a large number of other factors. Dedicated speed tests, like Ookla, are designed explicitly to measure speeds. </p>
					<p><i>The testing methodology is unclear</i>. Microsoft does not, for example, note which Microsoft services it includes when testing connections, whether it uses a single or multiple TCP threads, where its servers are located relative to the end-user receiving the updates, and whether it takes into account a user’s <a href="https://www.windowscentral.com/how-limit-foreground-downloads-bandwidth-windows-10-april-2018-update">choice to throttle</a> foreground updates. Each of those may affect the measured speed. For example, using a single TCP thread is likely to <a href="https://ecfsapi.fcc.gov/file/1083088362452/fcc-17-108-reply-aug2017.pdf">bias downwards</a> the measured speed by <a href="https://samknows.one/hc/en-gb/articles/115003164305-What-is-the-difference-between-Single-and-Multi-Thread-">not taking</a> into account the full available bandwidth.</p>
					<p>A minor issue is that the description of the dataset suggests that it is collected at the household level, but Microsoft discusses its own data as shares of the total U.S. population. Microsoft may have used the average number of people per household to obtain its population share estimates, but does not say so, leaving the exact measurement it uses somewhat unclear.</p>
					<p>An issue that may be major or minor is that Microsoft’s measure is the number of households receiving updates at “broadband speeds,” divided by the total number of households in the county or zip code. A better denominator would be the number of households that receive Microsoft updates. Whether this problem is large or tiny depends on which products Microsoft includes in its measures.</p>
					<h4>Good to Know</h4>
					<p><i>The data should be used carefully</i> until Microsoft makes its methodology more transparent. If its methodology is consistent across time periods, then it can be useful for tracking trends regardless of the accuracy of reported shares of households, but the error built into the measures is unknown. </p>
					<p><b>Data Source</b>: <a href="https://github.com/microsoft/USBroadbandUsagePercentages">https://github.com/microsoft/USBroadbandUsagePercentages</a></p>
					<h2 id="digitaldivide">Data from Initiatives to Address the Digital Divide</h2>
					<p>The U.S. has many subsidy programs intended to bridge different parts of the digital divide. Some, like most in the Universal Service Fund, have operated since the 1990s. Others are much newer.</p>
					<h3 id="rdof">Rural Digital Opportunity Fund (RDOF)</h3>
					<p><a href="https://www.fcc.gov/auction/904">RDOF</a> is an FCC reverse auction concluded in December 2020 designed to allocate subsidies for broadband buildout. Internet Service Providers bid for subsidies to build and provide service in areas identified as being unserved. The TPI Broadband Map includes the amount of money won for buildout by state, county, and Census Block, as well as Census Blocks the FCC identified as unserved and those that will receive support following the auction.</p>
					<h4>Pros</h4>
					<p>The list of unserved Census Blocks is probably the most accurate data on truly unserved areas due to the identification and <a href="https://www.costquest.com/resources/rdof/challenge/">challenge</a> process the FCC used.</p>
					<h4>Cons</h4>
					<p>Even with the challenge process, some of the included Census Blocks turned out to be places that should not have been included.</p>
					<p><b>Data Source</b>: <a href="https://www.fcc.gov/auction/904">https://www.fcc.gov/auction/904</a></p>
					<h3 id="ebb">Emergency Broadband Benefit (EBB) Program</h3>
					<p>The <a href="https://www.fcc.gov/broadbandbenefit">EBB</a> is a Covid relief program that provides up to $50 per month ($75 on tribal lands) to eligible households and a one-time $100 discount on equipment. The subsidies will continue until the program’s $3.2 billion has been spent.</p>
					<h4>Pros</h4>
					<p>Weekly updates on the number of subscribers in each state provide some real-time insights into how the money is being spent.</p>
					<h4>Cons</h4>
					<p>Weekly data is available only at the state level. The number of participants at the ZIP3 level (a ZIP3 area includes all 5-digit zip codes that share the same first 3 digits) is released only monthly, as is total spending, which is reported without any geographic breakdown.</p>
					<p><b>Data Source</b>: <a href="https://www.usac.org/about/emergency-broadband-benefit-program/emergency-broadband-benefit-program-enrollments-and-claims-tracker/">https://www.usac.org/about/emergency-broadband-benefit-program/emergency-broadband-benefit-program-enrollments-and-claims-tracker/</a></p>
				</div>
			</div>
		)
	}
}