Crowd counting 2: MCM Birmingham Comic Con

Following on from our previous experiment, the researcher repeated the crowd counting exercise in a more crowded environment, with a more complex task for Amazon Mechanical Turk (MTurk) to attempt.

(Read Part 1 to get the intro, method and background)

We will also be talking about crowd densities, comparing different types of events (Comic conventions) and looking at the size of the National Exhibition Centre (NEC).

Attendance and the size of the NEC

The National Exhibition Centre: “…has 20 interconnected halls, set in grounds of 611 acres (2.54 km2) making it the largest exhibition centre in the UK. It is the busiest and seventh-largest exhibition centre in Europe.” (link)

MCM Comic Cons are: “…the largest Modern Pop Culture event in Europe and one of the largest in the world, behind San Diego Comic Con and New York Comic Con.”

According to the MCM 2017 Media pack (link), the Birmingham Comic Con has an attendance of “35,000+”, though it is hard to determine what this equates to in terms of actual individuals. Presumably, for instance, one person attending two days counts as two attendances.

New York Comic Con and San Diego Comic Con, probably the two largest in the world, are quoted at 180,000 and 130,000 respectively (link).

As the NEC has 20 different halls, the layout changes from year to year. There are usually two events per year (March/November). The Birmingham March 2018 event was held in NEC Halls 3a and 4, which you can see on this interactive map (link).

These two halls together would be 24,595 square meters, so if all 35,000 attendances were unique individuals, and they all turned up at once, and there were no stalls or stages taking up floor space, this would be 0.7 square meters per person. 0.7 square meters per person doesn’t sound like much; and once stalls and stages are taken into account, the figure would probably be significantly reduced, by as much as half. So, we know that 35,000 attendance doesn’t equal 35,000 individuals in the halls at the same time.
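As a rough sketch of the arithmetic above (the function and variable names are my own, and it keeps the same unrealistic assumption that everyone is in the halls at once), the floor space per person could be worked out like this:

```python
def sqm_per_person(floor_area_sqm: float, people: int) -> float:
    """Floor area available to each person if everyone is present at once."""
    return floor_area_sqm / people


HALLS_3A_AND_4_SQM = 24_595  # combined area of NEC Halls 3a and 4
ATTENDANCE = 35_000          # headline "35,000+" attendance figure

whole_floor = sqm_per_person(HALLS_3A_AND_4_SQM, ATTENDANCE)

# Assumption: stalls and stages take up about half of the floor space.
half_floor = sqm_per_person(HALLS_3A_AND_4_SQM * 0.5, ATTENDANCE)

print(f"{whole_floor:.2f} sqm per person on an empty floor")
print(f"{half_floor:.2f} sqm per person with half the floor taken up")
```

Halving the usable floor halves the space per person, which is where the "by as much as half" comes from.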

If you would like to see some visualisations, Dr Keith Still is an expert on crowd density and safety, and his website has some examples of different crowd densities (link). Dr Still suggests that 0.2 square meters per person (5 people per square meter, or 500 people in 100 square meters) is approaching the maximum ‘safe’ level of crowd density: “Above 5 [people per square meter] the RISK of trips, slips or falls increases significantly.” Of course the movement and temperament of the crowd in question make a huge difference too! The ‘safe’ level may be halved if people are expected to be walking; just think about how much extra floor space you take up when you start swinging your arms and legs around. (Or is it only me who walks like that?)

Compared to other venues, exhibition halls clearly have quite a variable capacity; if a theatre has 500 fixed seats, then it is very difficult for them to alter this, whereas the amount of space available to an exhibition hall could vary greatly depending on how much floorspace is taken up by stalls, stages, eating / queuing areas etc. (From the maps provided by MCM, I would estimate that maybe half of the floorspace is taken up by stalls/stages, assuming these are to scale). The concern is probably more about crowding in certain specific areas, rather than over the whole area of the halls.

I have looked around and can’t find a quoted maximum capacity for the NEC or any of its individual halls, which makes sense, as the format could vary so much. In some respects it’s more like managing the layout of an outdoor event than a fixed venue.

Looking to a different event at the NEC, the Insomnia Gaming Festival moved to the NEC from the Ricoh Arena in 2015 as a result of needing more capacity; ‘over 36,000’ according to this article. (link). The same article quotes founder Craig Fletcher as “expecting a total of over 100,000 visitors through our doors in 2015”. It seems that by moving to the NEC, Insomnia was hoping to potentially treble its capacity. (Although their own website also quotes 150,000 visitors per year, though that could include other events – link)

Insomnia62 in 2018 took up Halls 1-4 of the NEC. Halls 1-3 are 36,120sqm, plus 3a+4 as before 24,595sqm for a total size of 60,715 square meters. However, Insomnia has an indoor camping area which is a particularly innovative/weird/fun idea (delete as you find appropriate) which by the looks of it takes up the whole of the hall (though it does not fill up the entire hall). (link). Apparently you get a 2m x 2.5m size pitch. (link)

i56 - Indoor Camping

Also, compared to MCM Comic Con, Insomnia takes place over 4 days (Fri-Mon) rather than 2 (Sat-Sun), so those ‘over 100,000’ visitors may be much more spread out. For completeness’ sake, taking all the same assumptions as before, Insomnia would have about 0.6sqm per person (60,715sqm for 100,000 visitors, or about 1.6 people per square meter) compared to MCM’s 0.7sqm, though this is a very abstract comparison to make.

What could all of this talk of people per square meter really be any use for? Well, for a general guesstimate of how crowded a certain area was at any given point; eg if you were at MCM and had less than 0.7sqm to move in, you are probably in a relatively crowded area of that particular event. More than 0.7sqm, you are in a relatively quiet area of the event. With further refinement, this method might be a useful indicator for event managers, sponsors and stallholders to bear in mind when comparing the logistics, risk and cost effectiveness of different approaches or layouts. How much footfall am I getting at my stall? Am I in a busy or quiet part of the event? Is my business affected by being too busy or too quiet?

In both MCM’s and Insomnia’s cases, the headline 35,000 and 100,000 attendance figures don’t sound too unrealistic, though these might be best understood as ‘realistic maximums’. They do not account for unique individuals and are going to conflate multi-day ticket holders with day ticket holders.

Furthermore, do these figures potentially include stallholders/workers too? For reference, and for a different type of event, 21% of the sample from the Baker Associates Glastonbury 2008 impact report were either volunteers, traders or performers/crew. (link) In the context of a 177,500 capacity festival, that is some 37,000 ‘non-audience members’. A Comic convention probably does not require so many workers/volunteers, though I do now wonder what that equivalent figure or proportion would be. This is not to suggest that an event must be ‘as much audience as possible’; workers and volunteers will probably all enjoy the event too, spend a bit of money, need facilities etc, but their needs and patterns are likely to be different.

I would also assume that the NEC and the event holders probably do have a specific figure somewhere for their target people-per-square-meter. Or perhaps it is a range, eg: X amount is safe but will feel a bit quiet, Y amount is busier, Z amount is the absolute maximum. Maybe X is okay for this type of event but for Y we want people to feel a bit more relaxed. And I haven’t even thought about the effect of people buying tickets and not showing up, or people coming and going from the main halls for food or other reasons.

To quickly test the reliability of the figures: in the MCM media pack, the major London event has ‘135,000+’ attendees, so we would assume it is around four times ‘bigger’ than the Birmingham event… and it is, assuming that all 90,000sqm of the ExCel centre’s two halls are used, to Birmingham’s 24,595sqm… which would be 0.67sqm per person to Birmingham’s 0.7sqm, so although it is bigger, it is about the same density… but this takes place over 3 days rather than 2… and what about ‘Kids go free’ offers… help… too many numbers… maybe another time.

Anyway, on this basis, I would at least expect the London and Birmingham events to both feel roughly as crowded as each other despite being quite different in terms of overall size.
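For anyone who wants to check that, here is a small sketch of the London versus Birmingham comparison, using only the headline attendances and hall areas quoted above. The dictionary layout and names are mine, and it still assumes everyone turns up at once on an empty floor:

```python
# Headline figures quoted above; "everyone at once, no stalls" is assumed.
events = {
    "Birmingham": {"attendance": 35_000, "area_sqm": 24_595},
    "London": {"attendance": 135_000, "area_sqm": 90_000},
}

# Floor space per person at each event.
density = {
    name: event["area_sqm"] / event["attendance"]
    for name, event in events.items()
}

# How many times 'bigger' London is, by attendance and by floor area.
attendance_ratio = events["London"]["attendance"] / events["Birmingham"]["attendance"]
area_ratio = events["London"]["area_sqm"] / events["Birmingham"]["area_sqm"]

for name, value in density.items():
    print(f"{name}: {value:.2f} sqm per person")
print(f"London is {attendance_ratio:.1f}x the attendance in {area_ratio:.1f}x the area")
```

Because attendance and floor area scale up by a similar factor, the space per person barely changes between the two events.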

Anyway, let’s get on with the crowd counting experiment, but if you want, this is a good bit of further reading about the visitors and traders that go to comic cons and the general growth they have seen over the last decade or so: “15 years of MCM London” (link)

The field work

Again, see part 1 for more detail.

The camera was set to capture a photo every 10 seconds; other exposure settings were not adjusted.

A monopod was used to raise the camera to a higher vantage point – not sure precisely how much, probably around 1-1.5m.

The camera took 20 photos over the approximately 3 minute period (190 seconds).
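The 190 second figure follows from the interval: 20 photos have 19 gaps between them. A minimal sketch, with variable names of my own choosing:

```python
INTERVAL_S = 10  # one photo every 10 seconds
N_PHOTOS = 20    # photos taken in this session

# Capture time of each photo, in seconds after the first one.
capture_times = [i * INTERVAL_S for i in range(N_PHOTOS)]

# Time between the first and last photo.
elapsed_s = capture_times[-1] - capture_times[0]
print(elapsed_s)  # 190
```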

This was from a position towards the east edge of Hall 4.

Using Google Maps, and the skylights in the photo as a reference, places us about here.

The circle and green lines above show roughly 40m radius from the central point. Despite the 360-degree camera, in this location we are effectively only looking at the front 180 degrees (though you can see a little bit behind, underneath the striped entrance/exit ways). Hopefully you can make out the green lines in the picture below, and the location of the skylights in the roof (the bluer lights).

In the first experiment, the researcher manually counted heads in every frame taken, before uploading the images to Amazon Mechanical Turk (MTurk) for remote workers to carry out the same task.

The important differences this time were:

  • The researcher only counted 4 frames (1 every 50 seconds) before uploading all of them to MTurk. This was to see how accurate the method may be with less input from the researcher.
  • The task was to count both people in costume (C) and those not in costume (N) to create two separate figures (in the photo above, 18 in costume:yellow mark, 84 not in costume:blue mark). Other ideas could be, people waiting/shopping at stalls, people walking, children/adults… What counts as a costume? Well, it’s in the eye of the beholder, this is part of the experiment.
  • The 40m range (the green line) was set because beyond it there was not sufficient resolution/quality to reliably make out individual figures.
  • This green line was carefully added to each image by hand, probably some variance there, something to improve in future. Also assuming the camera was held at the same height throughout…
  • The instruction was not to count heads specifically, but that any body part underneath the green line would count. Therefore someone ‘cut in half’ by the line would still count. The area on the right included seating and tables, therefore there are some people who are only partially visible, you can only see their heads, these are counted.
  • As before, two tasks were set up for each image, therefore two workers would each give unique responses for each image, and the average of these two could be taken.
  • The incentive per task was $0.08 (compared to $0.05 before).
  • Over the 2-3 days the batch was active, 129 tasks were completed, though 101 were rejected for giving what were viewed to be ‘junk’ responses. The batch took quite a lot longer than the first experiment, mostly because of the availability of the researcher to confirm or reject submitted tasks. The total charge was $3.80.
  • What a ‘junk’ response is, is up for debate but the rule of thumb was any responses totalling 20 or less would be assumed to have deliberately or accidentally misunderstood the nature of the task. (Nevertheless, some errors sneaked through)
  • This was somewhat confirmed, as the average time spent on a rejected task was 70s, while the average time for an accepted task was 173s. Unlike the first experiment, virtually all individual workers only completed 1 task.
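To make the ‘junk’ rule and the two-worker averaging concrete, here is a minimal sketch. The responses are made up for illustration (they are not the real MTurk submissions), and the names are mine; in practice the accept/reject step was done by hand in the MTurk interface.

```python
from collections import defaultdict
from statistics import mean

JUNK_THRESHOLD = 20  # responses totalling 20 or less were treated as junk

# Hypothetical responses: (image id, costume count, not-in-costume count).
responses = [
    ("frame_01", 20, 75),
    ("frame_01", 0, 3),    # total 3: rejected
    ("frame_01", 24, 65),
    ("frame_02", 10, 10),  # total 20: rejected
    ("frame_02", 18, 70),
    ("frame_02", 22, 80),
]


def is_junk(costume: int, not_costume: int) -> bool:
    """Rule of thumb from the list above: a total of 20 or less is junk."""
    return costume + not_costume <= JUNK_THRESHOLD


# Keep only the accepted responses, grouped by image.
accepted = defaultdict(list)
for image, costume, not_costume in responses:
    if not is_junk(costume, not_costume):
        accepted[image].append((costume, not_costume))

# Average the accepted (C) and (N) counts for each image.
averages = {
    image: (mean(c for c, _ in counts), mean(n for _, n in counts))
    for image, counts in accepted.items()
}
print(averages)
```

A fixed threshold like this is crude: it catches obvious non-attempts, but a careless count of, say, 40 would still get through, which is probably how some errors sneaked in.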

In summary, the task was much more complex in a number of respects. What were the results? Based on the four frames manually inspected by the researcher:

Researcher          Mean   Median  Min  Max  Std Dev  Variance
(C)ostume           22.8   24      18   26   3.4      11.6
(N)ot in costume    82.3   82      74   92   7.7      58.9
(T)otal             105.0  104     98   115  7.3      52.7
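For anyone wanting to reproduce this kind of summary, here is a minimal sketch using Python's standard `statistics` module. The four per-frame counts are my own illustrative values, chosen so that they happen to match the (C)ostume row once rounded; the actual per-frame counts are not listed in this post. The sample (n - 1) versions of standard deviation and variance are used, which appears to be consistent with the table.

```python
from statistics import mean, median, stdev, variance

# Illustrative per-frame costume counts (an assumption, not recorded data).
costume_counts = [18, 23, 24, 26]

summary = {
    "mean": mean(costume_counts),
    "median": median(costume_counts),
    "min": min(costume_counts),
    "max": max(costume_counts),
    "std_dev": stdev(costume_counts),    # sample standard deviation (n - 1)
    "variance": variance(costume_counts),  # sample variance (n - 1)
}

for name, value in summary.items():
    print(f"{name}: {value:.1f}")
```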

Firstly it may be relevant to note that the Standard Deviation and Variance are relatively similar to those reported in the first experiment (3.9 and 15.2), although this is only the case for counting the relatively smaller number of people in (C)ostume. For our purposes, this is a general measure of whether the counts are wildly varying though given the subject, we do actually expect to see some variance.

Also note that these figures didn’t change much (from experiment 1 to 2) when comparing stats from inspecting every frame, to inspecting only one every minute or so. Conclusion: whatever the precise accuracy of doing this manually is, it seems to be fairly consistent with one person doing the analysis.

Depending on needs, it may be necessary to sample at different frequencies: it could be that a specific occurrence (headline act takes the stage, a sudden rush to the exits) necessitates more detail; or it may be more beneficial to capture less frequently but to spread this out over a longer time of the event (over several hours rather than a few minutes).

Let’s see what the MTurkers have to add to the picture. In the following graphs, the larger points are the Researcher, the smaller points with line are the MTurkers. Here are the totals:

And here are the (C)ostume and (N)ot in costume:

And here is the underlying MTurk data.

MTurk               Mean   Median  Min  Max  Std Dev  Variance
(C)ostume           21.4   23      5    40   8.5      73.1
(N)ot in costume    68.5   71      41   100  16.9     286.1
(T)otal             91.4   91      70   130  17.7     311.7

For a comparison of averages:

                    MTurk            Researcher
                    Mean   Median    Mean   Median
(C)ostume           21.4   23        22.8   24
(N)ot in costume    68.5   71        82.3   82
(T)otal             91.4   91        105.0  104

There is surprisingly little difference again, although the majority of the difference is in people (N)ot wearing costumes, probably just because this is the larger group overall. Despite the concern over whether remote workers would be able to identify what ‘counted’ as a costume or not, it actually seems that in this case it is quite easy to miss the comparatively unremarkable and normally dressed people; or at least, if you are rushing through a remote work task, these are the people you may be likely to miss.

The differences between the averages (means) for MTurk and the Researcher were 1.3 (C), 13.8 (N) and 13.6 (T), which as a proportion of the Researcher’s means (assuming that the Researcher was 100% accurate) would be roughly 6% (C), 17% (N) and 13% (T).
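Here is a small sketch of that comparison, treating the Researcher's means as the reference. It uses the rounded means from the tables above, so the (C) difference comes out as 1.4 rather than 1.3; the function name is my own.

```python
# Rounded means taken from the comparison table above.
researcher_means = {"C": 22.8, "N": 82.3, "T": 105.0}
mturk_means = {"C": 21.4, "N": 68.5, "T": 91.4}


def shortfall_pct(reference: float, other: float) -> float:
    """How far `other` falls below `reference`, as a percentage of `reference`."""
    return (reference - other) / reference * 100


shortfalls = {
    key: shortfall_pct(researcher_means[key], mturk_means[key])
    for key in researcher_means
}

for key, pct in shortfalls.items():
    print(f"({key}) MTurk mean is {pct:.0f}% below the Researcher's")
```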

This is similar to the first experiment, but it is clear that larger, busier scenes are harder to analyse. Due to the way MTurk tasks are set up, it may have been more accurate if (C) and (N) were addressed by separate tasks, rather than counted in the same task. Adding more repetitions of the same task is an option; and a fairly cheap one at that, though I would imagine it would continue to reinforce the existing trend.

Introducing a new and ultimately user-interpreted variable (Costume or Not) also lets us compare the Researcher and MTurk in this respect, and with no particular instruction, it is interesting to see that the MTurk remote workers had a similar view of what counted as a costume as the researcher did.

Further discussion

Based on info from the first section of this post, of the 35,000 people estimated to attend MCM Birmingham, and based on the relatively tiny sample from this analysis, around 20% of attendees came in costume; or 7,000 people. With more or longer samples taken in different areas of the event, the estimate(s) would likely be different, but this is one possible way of asking additional questions from fairly simple to capture photo or video data.
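As a hedged sketch of that extrapolation: the costume share can be taken from either set of means in the tables above, then scaled up to the headline attendance. The names are mine, and the big assumption is that one small sample from one spot in Hall 4 represents the whole event.

```python
ATTENDANCE = 35_000  # headline attendance figure

# Mean counts from the tables above.
researcher = {"costume": 22.8, "total": 105.0}
mturk = {"costume": 21.4, "total": 91.4}


def costume_share(means: dict) -> float:
    """Fraction of counted people who were in costume."""
    return means["costume"] / means["total"]


researcher_share = costume_share(researcher)
mturk_share = costume_share(mturk)

# The rounded 'around 20%' figure used in the text.
rounded_estimate = round(0.20 * ATTENDANCE)

print(f"Researcher share: {researcher_share:.0%}")
print(f"MTurk share: {mturk_share:.0%}")
print(f"At 20% of {ATTENDANCE:,}: {rounded_estimate:,} people in costume")
```

Both shares come out a little over 20%, so ‘around 20%’ is, if anything, on the cautious side for this sample.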

Based on an extremely unacademic browse of Google, some random people on this forum seem to also think 10-20% at a comic convention is about right (link) while other sources suggest the number is growing, which might be changing the nature of such events (link). While there IS academic literature out there about conventions, cosplay and fan culture, it mostly seems to be about exploring the meanings, psychology and culture of it all (fair enough) but not so much those lovely crunchy numbers that I like to know about. Presumably the conventions themselves have some indication, from post event surveys etc.

Going back to our circle with ~40m radius, this would cover an area (the full 360 degrees) of about 5000sqm, although the area visible within these photos (the front 180 degrees) would be around 2500sqm. Still, given that the Hall involved here (Hall 4) was 16,700sqm, you could perhaps suggest that two or three cameras placed in the centre of the room would cover a big chunk of the space. However, there is an issue with height and sight lines: the higher up the cameras get, the further they are from their subjects and the less effective range they have, but with the cameras relatively low, it is likely their sight lines would be blocked by stalls/posters and people. Maybe a true ‘birds-eye’ view, directly overhead looking straight down, could do the job at this sort of scale. However, for events where people are mostly at the same height, this approach still seems to be flexible and reasonable to undertake.
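The coverage figures are just circle geometry. A quick sketch, ignoring overlap between cameras, the shape of the hall and anything blocking the view (the variable names are mine):

```python
import math

RADIUS_M = 40        # approximate usable range of the camera
HALL_4_SQM = 16_700  # area of Hall 4

full_circle_sqm = math.pi * RADIUS_M ** 2  # full 360 degree view
half_circle_sqm = full_circle_sqm / 2      # the 180 degrees visible here

# Share of Hall 4 that one camera with an unobstructed 360 degree view covers.
coverage_per_camera = full_circle_sqm / HALL_4_SQM

print(f"Full circle: {full_circle_sqm:.0f} sqm")
print(f"Half circle: {half_circle_sqm:.0f} sqm")
print(f"One camera covers about {coverage_per_camera:.0%} of Hall 4")
```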

I wonder if Andreas Gursky has ever wondered about this sort of thing.

So, with our 2500sqm semi-circle, we have counted an average of about 90 people and a maximum of about 130. This gives us a relatively huge 28sqm to 19sqm per person, compared to the tiny 0.7sqm we were talking about earlier.
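The same division as before, applied to the counted semi-circle (a sketch with rounded inputs; the names are mine):

```python
VISIBLE_AREA_SQM = 2500  # rounded area of the 40m semi-circle

# Rounded counts from the MTurk table: average and maximum totals.
counts = {"average": 90, "maximum": 130}

space_per_person = {
    label: VISIBLE_AREA_SQM / people for label, people in counts.items()
}

for label, sqm in space_per_person.items():
    print(f"{label} count: {sqm:.1f} sqm per person")
```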

Again, this is a fairly abstract comparison to make, between two very different methods, but it does further make you wonder about how the 35,000+ attendance figure (or any attendance figure really) breaks down in reality. Obviously there are more and less crowded areas of any event at various times for various reasons! (For reference, the average room in a new house in the UK was 15.8sqm: link)

(Sidenote: the 2017 rate card showed ‘Regional’ prices for stalls at £130+VAT per sqm…if they covered 50% of the floor space at Birmingham, would that be around £2m in stall fees?… a discussion for another time!)


We’ve expanded on the first experiment, with a more complex task (two categories) and across a larger area (40m) with a reasonably similar level of reliability. And we’ve been hugely, but hopefully interestingly, sidetracked by talking about comic conventions.

With this method, expanding the time covered is the principal area for future experiments to try, followed by further characterisation of the people in the images: for example, moving, cheering, taking photos, children/adults and so on. The format of MTurk tasks may be broken down further, asking questions individually rather than together, perhaps setting up more tasks to take averages from.

Stay tuned for more.
