class: center, middle, inverse, title-slide

# Data Analysis
## in the context of wildlife ecology and management
### Joe Thorley Ph.D. R.P.Bio.
### 2018-10-29

---

# Introduction

In this talk we will cover

--

- What Data Analysis is

--

- What Uncertainty is

--

- What Data are

--

- What Statistical Models are

---

# What is Data Analysis?

--

Data analysis is about

--

- reducing uncertainty (question)

--

- using information (observations/measurements -> data)

--

- given assumed relationships (model)

---
class: inverse
background-image: url(/slides/18-analysis_files/bison-1581895_1920.jpg)

# Uncertainties in Wildlife Management

--

The main uncertainties are

--

- How many individuals are there?

--

- How are they distributed through space?

--

- How are they changing through time?

--

- What factors are responsible for the distribution/changes?

---
background-image: url(/slides/18-analysis_files/bison-2237654_1920.jpg)
background-size: contain

--

Factors can be environmental and anthropogenic.

---
layout: true

# Uncertainty is measured using Probability

---
background-image: url(/slides/18-analysis_files/09_1987_1dollar_rev.jpg)
background-size: contain

---

<!-- -->

---

<!-- -->

---

<!-- -->

---

<!-- -->

---
layout: false

# A tossed coin

- What is the probability of heads before I toss the coin?

--

- What is the probability after I toss it but before I look at it?

--

- What is the probability after I look at it but before I tell you?

--

- What is the probability after I tell you it's tails?

--

- What is the probability if I then tell you I was lying and it's heads?

--

- What is the probability if I'm willing to bet you $10 it's heads?

--

- Are you willing to accept odds of 10:1?

---

# PROBABILITY DOES NOT EXIST

--

De Finetti (1970) Theory of Probability

--

- Probability is a measure of **your** uncertainty about the possible states of the world.

--

<!-- -->

---
layout: true

# How many bison are there?
---

<!-- -->

---

<!-- -->

---

<!-- -->

---

<!-- -->

---
layout: false

# Information reduces uncertainty

- How much information is conveyed by being shown a coin is heads?

--

`$$-\log_2(1/2) = 1\ \text{bit}$$`

--

- How much information is conveyed by being shown a two-headed coin is heads?

--

`$$-\log_2(1/1) = 0\ \text{bits}$$`

--

- What about a random number generator?

--

It depends on the number of possible values.

--

If there are 256 possible values then

`$$-\log_2(1/256) = 8\ \text{bits} = 1\ \text{byte}$$`

--

It also depends on the generator...

---

--

Not all generators are created equal!

---
background-image: url(/slides/18-analysis_files/bison-1801981_1920.jpg)

## Data is Coded Information

--

> At Easter about an hour after midday I was walking by the river and I saw a female bison, and another female behind it and another male one behind it. I went back at sunset but didn't see anything.

--

"04/01/18 1:05 two female and one male bison 44.4280 N 110.5885 W"

---
background-image: url(/slides/18-analysis_files/bison-1801981_1920.jpg)

## Data is machine-readable
--

This dataset has at least five problems!

--

--

- Numbers are mispunched

--

- Two different observers

--

- 4th of January

--

- Just after midnight

--

- In the Gobi Desert

--

- Missing zero count!

--

- Didn't record whether there were calves!

--

- I am not trained to sex bison!

--

*80% of data analysis is data cleansing and tidying!!*

---

## Data should be stored in relational databases
--
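As a minimal sketch of what this could look like, the bison sightings from the earlier slide can be split into related tables using Python's built-in `sqlite3` module. The table and column names, the observer name and the sunset time are illustrative assumptions, not part of the original records.

```python
import sqlite3

# Hypothetical schema: all table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE observer (
  observer_id INTEGER PRIMARY KEY,
  name TEXT NOT NULL UNIQUE
);
CREATE TABLE visit (
  visit_id INTEGER PRIMARY KEY,
  observer_id INTEGER NOT NULL REFERENCES observer (observer_id),
  start_time TEXT NOT NULL,  -- ISO 8601 avoids day/month and am/pm ambiguity
  latitude REAL NOT NULL CHECK (latitude BETWEEN -90 AND 90),
  longitude REAL NOT NULL CHECK (longitude BETWEEN -180 AND 180)
);
CREATE TABLE sighting (
  visit_id INTEGER NOT NULL REFERENCES visit (visit_id),
  sex TEXT NOT NULL CHECK (sex IN ('female', 'male', 'unknown')),
  n INTEGER NOT NULL CHECK (n >= 0),
  PRIMARY KEY (visit_id, sex)
);
""")

conn.execute("INSERT INTO observer (observer_id, name) VALUES (1, 'Observer A')")
conn.executemany(
    "INSERT INTO visit VALUES (?, ?, ?, ?, ?)",
    [
        # Easter 2018, about an hour after midday; west longitude is negative.
        (1, 1, "2018-04-01T13:05", 44.4280, -110.5885),
        # Return visit at sunset; the exact time is an illustrative guess.
        (2, 1, "2018-04-01T19:45", 44.4280, -110.5885),
    ],
)
conn.executemany(
    "INSERT INTO sighting VALUES (?, ?, ?)",
    [
        (1, "female", 2),
        (1, "male", 1),
        (2, "female", 0),  # zero counts are stored explicitly
        (2, "male", 0),
    ],
)

# Total bison counted on each visit.
totals = dict(
    conn.execute(
        "SELECT visit_id, SUM(n) FROM sighting GROUP BY visit_id ORDER BY visit_id"
    ).fetchall()
)
print(totals)
```

Because each visit is its own row, the sunset visit keeps its zero counts, and the `CHECK` constraints reject out-of-range coordinates or negative counts at entry.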
---

# What are we able/willing to assume?

--

- All individuals were at the river at midday on April 1st and observer efficiency is 100%?

--

Then there are 3 bison!

--

- Bison are distributed randomly/evenly throughout their range?

--

- Bison groups spend 50% of their time near water?

--

- Bison aggregate in groups of 3 to 10 individuals?

--

- Observer efficiency is 50%?

--

- Observer efficiency varies between 40% and 60%?

--

A model is one or more relationships.

--

Such relationships can be certain or uncertain.

---

# All models are wrong

---

# All models are wrong (but some are useful)

--

> The map is not the territory

Alfred Korzybski

--

> Everything simple is false. Everything which is complex is unusable.

Paul Valéry

--

> There is no such thing as an absolutely True thought.

Steven Gray

---
layout: true

# Confidence/Credible Intervals

---

Uncertainty can be summarized using confidence/credible intervals.

--

<!-- -->

---

Uncertainty is typically represented on the y-axis.

--

<!-- -->

---
layout: false

# Statistical Significance

--

Significance is often used to test whether a factor has an effect.

--

<!-- -->

---

# Significance is not Importance!

<!-- -->

--

With no data all effects are insignificant!

--

With infinite data all effects are significant!

---

# Report Effect Sizes

<img src="/slides/18-analysis_files/scalf.png" style="width: 50%" />

---

# Summary

--

- Uncertainty is personal

--

- Clean and tidy data is essential

--

- Models are inescapable

--

- Effect sizes are needed for decision-making

---
class: center, middle

# Thanks!

For more information see

www.joethorley.io

www.poissonconsulting.ca