Sunday, 29 March 2015

Laser Tag Rating System

This is a rough outline of an idea that Phil Ophus and I had. We want a rating system for laser tag. We want it to be a cumulative reflection of a player's skill over multiple games. We want a means to compare the laser tag skill of players who have a substantial play record, but not necessarily against each other in the same match. In short, we want an Elo or Trueskill style metric that is a shorthand for "this player is X skilled at laser tag".
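To make the target concrete, here's a minimal sketch of the standard two-player Elo update in R (these are the conventional chess constants; a laser tag version would need some multiplayer, score-aware generalization of this idea):

 ## Standard Elo update for one head-to-head result (sketch only)
 elo.update = function(r.a, r.b, score.a, K=32){
   ## score.a: 1 for a win by player A, 0.5 for a draw, 0 for a loss
   expected.a = 1 / (1 + 10^((r.b - r.a)/400))
   r.a + K*(score.a - expected.a)
 }
 elo.update(1500, 1600, 1) ## upset win; rating rises by about 20 points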

However, laser tag has less standardization than the sports and games that these metrics are usually applied to.


As it is now, the single game score cards don't give any context for a player's skill beyond the single match displayed. Scores in general are higher in matches with many players and in longer 'ironman' matches. From these factors alone, a score of 50000 can demonstrate just as much achievement as a score of 100000 against equally skilled opponents.
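As a toy illustration (a hypothetical normalization of my own, not a settled method), dividing raw score by opponent count and match length puts those two scores on an equal footing when the higher one came from a match twice as long:

 score = c(50000, 100000)
 opponents = c(10, 10)  ## hypothetical opponent counts
 minutes = c(15, 30)    ## hypothetical match lengths
 score / (opponents * minutes) ## both work out to 333.3 points per opponent-minute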


There are strategic factors that affect score but don't fairly reflect skill, such as picking mostly weak targets, or playing an aggressive run-and-gun style rather than protecting the base. Besides within-game luck, additional random noise comes from equipment effects; some guns are in better shape than others.


Scores are naturally different between free-for-all and team games, and some players do better in different formats. However, it's reasonable to assume a strong enough association between a given player's skill levels across formats that one format can inform the other.

All of this variation comes from just a single location: Planet Lazer near Braid Station, New Westminster. At this location, the scoring system rewards hitting much more than it penalizes being hit. Also, every target on a vest is worth the same amount, although this isn't necessarily true at other locations.

We want a ratings method that can be used to compare players in different arenas that may be using different rules. Ideally, it would be something that lets anyone see how well they stack up at anywhere from a regional to a worldwide level. However, even if we only use places with comparable equipment, the arenas are substantially different, whereas in many other sports the arena effect is negligible. The rules and scoring systems even differ from place to place.

Our intuition and a short train ride's worth of literature searching suggest that no such system exists yet that can handle the non-uniform, free-for-all situations of laser tag. I'm hoping that further developing the cricket simulator to handle data from cricket players who compete in multiple formats for multiple teams in a single year will be a useful step in that direction.

On the sampling design and logistics side, there are issues with data collection. What if a location's equipment doesn't record a key variable? How long is data retained? Are there enough return players? It seems like the next step is to draw up a research proposal, bring it to Planet Lazer, and see if they would let us record their player data like that.

For after the thesis, if at all.

Sunday, 15 March 2015

R Packette - Weighted Binomial Sums

This R code file Weighted_BinSum.r is a... packette? Proto-package? Mini-package?

It's by no means a full R package because it doesn't have the proper documentation, but it's the start of what could be one.

This packette is titled "Exact and Approximate Distributions of Weighted Binomial Sums". It's used to produce the distribution of sum( w * binomial(n,p) ), where n and p are the usual binomial parameters and the w values are arbitrary positive weights. It also has functions to work with the resultant distribution (pbinsum, qbinsum, dbinsum, and rbinsum) much as you would with any of the classical distributions.
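For intuition, here's a minimal sketch of one way to get the exact distribution in the equal-weights case, by repeated convolution of binomial PMFs (this is just to illustrate the idea, not necessarily how make.binsum works internally):

 binsum.pmf = function(n, p){
   pmf = 1 ## start with the PMF of a constant zero
   for(i in seq_along(n)){
     bi = dbinom(0:n[i], n[i], p[i])
     ## Convolve the running PMF with the next binomial's PMF
     out = rep(0, length(pmf) + n[i])
     for(k in seq_along(bi)){
       idx = k:(k + length(pmf) - 1)
       out[idx] = out[idx] + pmf*bi[k]
     }
     pmf = out
   }
   pmf ## pmf[j] is P(sum = j - 1), with support 0 to sum(n)
 }

With n = c(50,100,150,200,250), this gives a support of 0 to 750, matching the 751 values in the first example below.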

The function to compute the sum distribution, make.binsum( ... ), uses an operational parameter, "tol" for tolerance, which limits the number of possible sum values that will be evaluated. If the anticipated number of distinct values the weighted sum can take exceeds this tolerance, values are rounded off and grouped together into "tol" evenly spaced increments, and a flag indicating that the distribution is approximate, "is.approx", is set to TRUE. If such an approximation is never necessary, the exact distribution is computed instead. The default tolerance is 1 million for weighted binomial sums, and 100 million when the weights are all equal.
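As a rough sketch of that kind of grouping (my guess at the mechanics, not the packette's actual internals), an oversized support can be binned into evenly spaced increments and the mass summed within each bin:

 ## Stand-in support values and masses, just for illustration
 set.seed(1)
 x = sort(runif(5000, 0, 100))
 pdf = rexp(5000); pdf = pdf/sum(pdf)
 tol = 1000
 if(length(x) > tol){
   breaks = seq(min(x), max(x), length.out = tol + 1)
   bins = findInterval(x, breaks, rightmost.closed = TRUE)
   pdf.approx = as.vector(tapply(pdf, bins, sum)) ## grouped mass
   x.approx = as.vector(tapply(x*pdf, bins, sum)) / pdf.approx ## mass-weighted bin centers
   is.approx = TRUE
 }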

Another operational parameter, "batch_size", is used to control the number of possible (non-distinct) sum values that are considered at once. Larger batch sizes reduce the computation time but require more memory. The default batch size, a soft limit, is 1 million, which should limit the RAM footprint of this function to a couple hundred MB at worst. Garbage collection gc() is called to ensure the RAM usage is temporary.

The default output of make.binsum( ... ) is a list including $x, a vector of the possible sum values; their probability masses, $pdf; and cumulative probabilities, $cdf; as well as the flag $is.approx and the input parameters $n, $p, and $w. By setting the operational parameter out_df in make.binsum( ... ), you can get a data frame of $x, $pdf, and $cdf instead. Both the data frame and the list work with the pbinsum, qbinsum, dbinsum, and rbinsum functions.


Examples:

source("Weighted_BinSum.r")
 n = c(50,100,150,200,250)
 p = c(0.1,0.2,0.3,0.4,0.5)
 w = c(1,1,1,1,1)
distr = make.binsum(n=n,p=p,w=w)
str(distr)

List of 7
 $ x        : num [1:751] 0 1 2 3 4 5 6 7 8 9 ...
 $ pdf      : num [1:751] 1.44e-155 6.89e-153 1.64e-150 2.61e-148 3.11e-146 ...
 $ cdf      : num [1:751] 1.44e-155 6.90e-153 1.65e-150 2.63e-148 3.13e-146 ...
 $ n        : num [1:5] 50 100 150 200 250
 $ p        : num [1:5] 0.1 0.2 0.3 0.4 0.5
 $ w        : num [1:5] 1 1 1 1 1
 $ is.approx: logi FALSE


And as a data frame



distr = make.binsum(n=n,p=p,w=w,out_df=TRUE)
str(distr)

'data.frame':   751 obs. of  3 variables:
 $ x  : num  0 1 2 3 4 5 6 7 8 9 ...
 $ pdf: num  1.44e-155 6.89e-153 1.64e-150 2.61e-148 3.11e-146 ...
 $ cdf: num  1.44e-155 6.90e-153 1.65e-150 2.63e-148 3.13e-146 ...


And some query functions (either as a data frame or as a list)

 ## Get the CDF
 pbinsum(c(275,283,291,296,300,305,311,315,320,326),distr)
 [1] 0.5167767 0.7480498 0.9019275 0.9536964 0.9768502 0.9913582 0.9977820 0.9991968
 [9] 0.9998006 0.9999688

 ## Trace CDF back to quantiles
 temp = pbinsum(c(275,283,291,296,300,305,311,315,320,326),distr)
 qbinsum(temp, distr)
 [1] 275 283 291 296 300 305 311 315 320 326


 temp = pbinsum(c(275,283,291,296,300,305,311,315,320,326)+0.5,distr)
 qbinsum(temp, distr) # Some error introduced by linear interpolation
 [1] 275.4991 283.4868 291.4748 296.4673 300.4613 305.4539 311.4451 315.4393 320.4321
[10] 326.4234

 ## Get the probability mass
 dbinsum(c(275,283,291,296,300,305,311,315,320,326),distr)
 [1] 3.128514e-02 2.558020e-02 1.417002e-02 8.051091e-03 4.597739e-03 1.995477e-03
 [7] 6.023298e-04 2.407634e-04 6.700921e-05 1.189357e-05

 ## Randomly select values (with replacement) based on the PMF
 rbinsum(10,distr)
 [1] 279 279 269 270 285 274 277 275 266 266


Finally, a weighted case, which takes more time and RAM, produces much larger objects, and is sometimes approximate.

n = c(50,100,150,200,250)
p = c(0.1,0.2,0.3,0.4,0.5) 
w = c(1,1.2,1.3,1.67,2)
distr = make.binsum(n=n,p=p,w=w)

 str(distr)
List of 7
 $ x        : num [1:117967] 0 1 1.2 1.3 1.67 ...
 $ pdf      : num [1:117967] 1.44e-155 8.00e-155 1.32e-147 4.36e-125 2.94e-90 ...
 $ cdf      : num [1:117967] 1.44e-155 9.44e-155 1.32e-147 4.36e-125 2.94e-90 ...
 $ n        : num [1:5] 50 100 150 200 250
 $ p        : num [1:5] 0.1 0.2 0.3 0.4 0.5
 $ w        : num [1:5] 1 1.2 1.3 1.67 2
 $ is.approx: logi TRUE


Given time, I would like to expand this to more distributions like the negative binomial, Poisson, and arbitrary discrete distributions. I would also like to incorporate more approximation methods, including the one investigated by Michael Stephens (SFU) and Ken Butler (UofT) that inspired this work originally.

As always, feedback and bug reports are greatly appreciated, either in the comments here or at jack.davis.sfu@gmail.com

Tuesday, 10 March 2015

Improving the Exam Experience - A Research Proposal

I've read through quite a few reports on improvements in university learning in the last year. A few ideas have made quite an impression, like the flipped classroom paradigm, in which lecture time is spent working through homework problems with the guidance of the instructor and peers, and instruction and note-taking happen before class with the use of online videos.

However, I've only seen a few pieces on improving the exam experience. Students spend a lot of time thinking about their exam performance, but besides writing good questions (which is a difficult art in itself), not much thought on the instructor side seems to have gone into it.


With that in mind, I propose to redevelop the experience of the sit-down-and-write style of exam. Specifically, I want to see if there's a way to allow students to customize their own experience in ways that still maintain fairness, such that every student is given the same exam.

Each of the following environmental factors is, I assume from my very limited background research, capable of improving exam performance either by quelling anxiety or by aiding recall of information. However, the idea isn't to see if there are differences between the factors by some factorial design, but to see if students can improve their performance by making their own informed choices of their individual exam environments.

Exercises


Circulation is valuable. Also, I've heard in a study skills seminar that there are specific exercises you can do to improve the use of both brain hemispheres at once. While exams are being handed out, an examiner could incorporate these exercises into their speech at the beginning of the exam. Students will decide on their own if they want to participate.

Those include, in order of increasing obnoxiousness:

  • Run a pen, pencil, or finger closely in front of your field of vision in a figure eight such that it's only in front of one eye, then the other, and back again a few times.
  • Pat your head and rub your tummy. Switch hands. Repeat.
  • Gently pinch your nose with your left hand, and wrap your right hand around to pinch your left earlobe. Once you've done that, switch hands and lobes so that your right hand is on your nose and your left hand is on your right earlobe. Switch as fast as you can without assaulting someone or choking on your gum.


Paper Colour

Credit to Linda Noakes for teaching me this, and to Carl Schwarz for confirming it independently. Colours matter for evoking certain mindsets. For example, orange is associated with increased anxiety. Since many exams are already given in multiple formats, with the questions in a scrambled order or slightly altered, using exam-positive colours seems like an easy transition.

Students could choose, by way of seating choice, between neutral white paper, blue papers for calmness, or green for memory enhancement. If scrap paper is involved, choices don't even need to be related to seating.


Priming Questions


Priming is centered around the idea that making someone process some idea causes them to take on some of the traits of that idea. The most common example I've seen is one where subjects are asked to form a simple sentence by arranging four out of five words from a list, and to do this for a collection of lists. However, within each list is a word associated with a particular abstract idea, such as old age ("wrinkled", "Florida", or "retirement"), or money ("cash", "investment", or "retirement"). The groups primed for old age walked out of the testing room slower than those in a placebo group; the groups primed for money acted with less empathy than a placebo group in an exercise that followed.

It has also been used in an exam setting to positive effect by having students start a test by answering a question about being smart or imagining themselves in a position of esteem based on intelligence.

In the proposed exam environment, the first graded question of every exam could be to choose one of a few priming questions to be answered in 1-2 sentences, such as

  • "What would like to study if you were a top scholar?", 
  • "Describe the best you, you can be."
  • "What makes your ideal vacation spot so special?"

My main concern is that, given that I want all the student decisions to be informed, and that the priming examples I've found were of unintentional or covert priming, how well would this work for questions as transparent as these?

Music


We have gotten used to working in environments with music on almost constantly. A silent exam room is unnatural and silence effectively amplifies any disruptive sounds that do happen. In most cases, exam takers are free to bring in their own earplugs, which some do, but that's still only a partial solution.

It would likely be disruptive to have music playing for everyone, regardless of how innocuous. To allow students to bring in their own music is to allow exam related material in. However, mp3 players are cheap enough now that one could be assigned to every student with their exam without increasing the cost substantially, assuming that most of the mp3 players are retrieved afterwards. A quick search finds 4GB (~500 songs) mp3 players being sold for $13 individually (Tiger Direct), which suggests they could be purchased for $10 each in bulk units of 100 or more.

Each mp3 player could be loaded with the same playlist, and the playlist could be composed of copyright-free music selected by the professor and the students. I doubt it would be particularly hard to make a way to load dozens of mp3 players with the same playlist automatically.

Ethics Considerations


If these factors aid students, I'd want the world to know. That means results need to be published, and that in turn means this is a proposal of research involving humans. The main potential for harm is in accidentally making someone's exam more difficult than necessary by adding the burden of having to make additional decisions, or by aiding students unequally by their choices. For example, if the students who chose the green paper and chose not to listen to music did better than the rest, am I doing a disservice to the other students? Exams would have an open question at the end for feedback, and I could check for effects from specific choices with multiple regression, and perhaps with propensity scores, and scale as necessary.

Also, students will be fully informed of the intent and studied effects of each of the environmental factors listed above. A lot of literature review is required on my part to verify each of these, and to hopefully find additional factors as replacements or additions.

Sunday, 8 March 2015

Kudzu Update - Lock and Key

New Material:

First, there is a new dungeon available: Lock and Key, which features a pair of new spaces. Most of the map is sparse on treasure except for an optional room in the middle that becomes available about 2/3 of the way through the map. I'm worried that on its own the treasure hoard is available too late to matter, but it could be just right for any meta-game rewards that are incorporated later.


The rules have been made more explicit in some cases, and the relic deck has been increased from 16 to 20 cards. The card "Drill Relic" has been removed - see the patch notes for details.




Patch Notes:

I've made some changes to Kudzu's presentation intended to make it scale up more smoothly. 

- First, I've split off each dungeon into its own document. Each dungeon module includes the playable map for that level, a guiding layout, and an explanation of spaces found on that map. If there are any cards specifically for that map, they will be included.

- Generic Kudzu material such as rules and cards, however, is found only in a separate document. This way, updating the rules means changing one document instead of several. It also reduces legacy issues with players applying older rules to newer versions of the game.

- The card "Drill Relic" has been removed from the game for now. Having the ability to place tiles on walls is exciting, but has the potential to ruin any map where a major barrier is a single wall thick, as in Lock and Key. Cards with wall-breaking abilities will appear later, but attached to specific maps as starter cards.


Other thoughts and plans:

- The next module, Ulee's Pitfall, will not introduce new tiles, but will focus on more challenging uses for letter tiles. Players will start with a Pillar Relic, which allows them to turn letter tiles (and other tiles) into walls.

- Another possibility is 'portal' tiles that allow the player to start a crossword at one portal location and continue it at another. However, there may be memory issues with that. I want to experiment more with "Lock and Key" and tracking which keys have been collected before I try that.

- The logistics of a print-and-play game are more involved than I expected. Making things portable is taking extra time, but hopefully the modularity changes will improve the development flow and allow more time for the creative aspects of the game.

- To my knowledge, nobody is testing the game at the moment, and neither am I. Most of my energy is going into thesis work, so playtesting and finding other playtesters will have to wait until after the defense.

- I've been investigating the prospects of making Kudzu an Android game, again after the defense. I want to make the game a net good for the environment, and it seems like the best way to do that is by having a running advertisement bar on the bottom of the game, with a large portion of the revenue going towards either planting trees or buying up carbon permits and letting the permits expire unused. There are some other ideas involving in-game currency and 'cashing out', but the logistics of that are a distraction from game development at the moment.