Saturday, September 24, 2016

Carbon capture, perpetual motion machines, and IGORs

There's one quick rule to evaluate energy-related technologies: if you can turn them into perpetual motion machines, they aren't real.

In conversation with an IGOR (Ignorant Grandstanding Oblivious Rabble-rouser), I pointed out that the idea of using atmospheric carbon dioxide to make fuel isn't entirely new (Nature did it first), but the technologies being proposed aren't realistic, for the reason above.

IGOR countered that these processes could, in his view, be the solution to our energy crisis (do we have one?), because the fuel produced by carbon-capture will provide the energy to keep the process going.

Ahem. Let's think about this, with a diagram:



What reasonable people say is that the energy extracted from the fuel will partially cover the energy needs of the capture and conversion process (that is, with $x$ the energy the process consumes and $y$ the energy recovered from the fuel, $x > y$, but not by much); what IGORs say is that $y>x$. But if that were so, we could feed the exhaust from the energy production system into the input for the capture system, and get a perpetual motion machine that generates free energy.

Some of the more reasonable proponents of this carbon-capture and conversion idea suggest that the energy coming in can itself be green energy, like solar, and therefore there's a net "carbon-based" energy coming out of the system. Two points:
First, that's fine, but then why use part of that solar energy to create carbon-based fuels, instead of using the solar energy to replace the carbon-based fuels? Note that any $\mathrm{CO}_2$ that gets turned into fuel will yield another $\mathrm{CO}_2$ after the energy generation (conservation of the carbon), so no advantage there.
Second, the designs proposed look extremely wasteful of energy: capturing $\mathrm{CO}_2$ after it has diffused into the atmosphere is bound to require a lot of energy to flow non-$\mathrm{CO}_2$ gases in the atmosphere through the carbon-capture process. Better to stop $\mathrm{CO}_2$ at the source, if that's what you're after.
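To put a rough number on the second point, here is a back-of-the-envelope sketch of how much air has to flow through a capture plant per tonne of $\mathrm{CO}_2$. The 400 ppm concentration is an assumed round figure for 2016, and `capture_fraction` is a hypothetical parameter, not something taken from any proposed design.

```python
# Back-of-the-envelope: how much air has to pass through a capture plant
# to yield one tonne of CO2? All constants below are assumptions.

CO2_PPM_BY_VOLUME = 400.0   # assumed: roughly the 2016 atmospheric level
MOLAR_MASS_CO2 = 44.01      # g/mol
MOLAR_MASS_AIR = 28.97      # g/mol, mean for dry air


def tonnes_of_air_per_tonne_co2(ppmv=CO2_PPM_BY_VOLUME, capture_fraction=1.0):
    """Mass of air processed per tonne of CO2 captured.

    capture_fraction is the (hypothetical) share of the CO2 in the
    processed air that the plant actually removes; 1.0 is the ideal case.
    """
    # Convert a volume (mole) fraction into a mass fraction.
    mass_fraction = ppmv * 1e-6 * MOLAR_MASS_CO2 / MOLAR_MASS_AIR
    return 1.0 / (mass_fraction * capture_fraction)


ideal = tonnes_of_air_per_tonne_co2()
half = tonnes_of_air_per_tonne_co2(capture_fraction=0.5)
print(f"ideal capture: {ideal:,.0f} t of air per t of CO2")
print(f"50% capture:   {half:,.0f} t of air per t of CO2")
```

Even in the ideal case that is well over a thousand tonnes of air moved for every tonne of $\mathrm{CO}_2$, which is why capturing at the source looks like the better deal.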
Of course, as I mentioned, Nature does provide us with a technology to use solar power to capture $\mathrm{CO}_2$ and turn it into fuel:



It also has the advantage of being pretty, giving shade, operating in silence, and bearing fruit. Trees. It's trees. Let's plant more trees. I like trees.

One particularly oblivious IGOR insinuated I was anti-environment because I prefer trees to useless noisy subsidy-harvesting machines.

With friends like that, the environment is doomed.

Wednesday, September 21, 2016

Geeking out midweek...

Just a few pieces of geekery; the last few days have been a coding ultramarathon, and as any strength athlete knows, marathons are bad for you.


The internet, good only for gossip and pornography... oh, wait

MIT professor Thomas Eagar has a number of fun stories about materials and material science, and also explains how shaped charges work. These stories are also illustrations of how engineering gets complicated once one goes beyond the block-diagram understanding that most people who "like science" get from popularizers and documentaries. (Don't get me started about documentaries.)




Here's the companion piece "Bringing new materials to market" that Eagar mentions in the video (PDF).

(That's pretty good teaching for MIT. No, I'm not being facetious; that's really much better than the average pedagogy at MIT. Don Sadoway and Walter Lewin are exceptions, not the rule.)


David Brin (one of the 3 Killer Bees of science fiction, with Greg Bear and Gregory Benford) tells XPRIZE what's wrong with most contemporary dystopian fiction:



Via Myles Power, we learn about Moms Across America and their revolutionary take on chemistry and the physiology and biochemistry of inflammation


"This hydrogen is the small particles, not the large particles," so not H-235, then? Whaa...?




Stuff I like, and might even want, but don't need, so won't buy

Hasselblad, the camera company that makes other camera companies look bad, has a new camera, mirrorless. As I said in my Less post, the limiting factor on my photography is my skill (comma lack of) so I don't need a new camera; but I can lust after it:



Penguin Prof made a video describing the equipment she uses for her MOOCs and online science videos. (I might get the Astropad software, since it appears to solve an important limitation in using the mac for highly interactive presentations.)



I used to have some educational videos in my channel, but for reasons that I won't discuss I have taken them private (or unlisted); if I were still in the mass education business, I'd probably be making some videos to support classroom activities and would invest in some better equipment. For the kind of videos I do (mostly for fun), it would be wasteful:




And a 17-inch tablet from Dell? Obviously I would like to consider it, but unless there's a touch-enabled version of Linux around for it, I'm going to pass. Need to save some money for that Apple-McLaren X1 self-driving supercar...




Some motivation from the great philosophers at T-Nation





https://www.t-nation.com

And by far the most insightful:


This wisdom comes courtesy of Testosterone Nation online

A few powerlifter jokes:
  • What do you call the bodybuilder with half a brain? Gifted.
  • Why do bodybuilders congregate in groups of three? One that can read, one that can count, and one to keep an eye on the dangerous intellectuals.
  • Why is the Westside Barbell 5-by-5 program so popular with bodybuilding coaches? Because they can count sets on one hand, reps on the other. [It's an old program of five sets of five reps.]
  • What do you get a powerlifter who does Crossfit? An ambulance.
  • Why do gymbros use the squat rack for standing curls? Because they see real athletes use the squat rack (for squats) and think the secret of strength is the location. [Gymbros are annoying wannabe bodybuilders, the type that always skips leg day.]
  • If a vegan also does Crossfit, what does he harangue you with first? 
  • What are the three most important points of a gym for gymbros? (1) free wifi to post every single rep to Facebook without hitting the phone data cap; (2) mirrors everywhere to strike a pose after every set; (3) no powerlifters around so that they can believe they're real athletes.
  • Monolifts are very confusing to gymbros. They can't figure out how to do standing curls in them. [Monolifts are for heavy squats; gymbros use the squat rack, improperly, to do standing curls instead of squats. The "therapy" image above has a monolift in the foreground.]

Sunday, September 18, 2016

Just another geeky Sunday

Irritating science-related ignorance

Thunderous foot-in-mouth. Well, Thunderf00t has another video about SpaceX where he shows he doesn't understand why Elon Musk wants as much information as possible (i.e. other people's videos, to add to the ones SpaceX made and the 3000 streams of data collected). TF shows his usual depth by mocking SpaceX for not investing in "cameras costing 100 bucks." SpaceX had a lot of cameras on the pad and off, but the more information the better.



At this point is Thunderf00t going the Charlie Sheen way, where his audience just tunes in to see how bad he's going to screw up this time?

On the upside, I found, via Professor Moriarty (yes, like the Sherlock Holmes villain, except it's Philip, not James), a series of videos debunking TF's previous parade of ignorance (regarding the Hyperloop):


My modest contributions here and here. Also a comment I made on The Arts Mechanical, capturing the genesis of the problem with scientists commenting on engineering:


When he's not planning the paint-napping of the Mona Lisa to effect the sale of several forgeries to unscrupulous art collectors, Professor Moriarty collaborates with Sixty Symbols to explain Physics concepts:




Popular Science. More relevant than Thunderf00t's demonstration of what happens when scientists assume engineering is trivial, is Popular Science, which is a respectable STEM popularization magazine. And when Popular Science writers make mistakes, they have more widespread audiences.

In a pretty good article about Jeff Bezos joining the "let's get out of Earth" club, a Popular Science writer shows a clear misunderstanding of what's difficult about getting to orbit (rather than "space," arbitrarily defined as 100km altitude):


The problem isn't getting to altitude, the problem is giving the payload orbital velocity, which for Low Earth Orbit is on the order of 28,000 km/h. (My tweet said "height" because 140 characters, haters.)
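A rough sketch of the numbers, per kilogram of payload. The 400 km orbit altitude is an assumption (roughly where the ISS flies), and the calculation ignores drag, gravity losses, and the boost from Earth's rotation, so treat it as an order-of-magnitude comparison only.

```python
import math

GM_EARTH = 3.986e14      # m^3/s^2, Earth's gravitational parameter
R_EARTH = 6.371e6        # m, mean radius
LEO_ALTITUDE = 400e3     # m, assumed (roughly ISS altitude)
SPACE_ALTITUDE = 100e3   # m, the arbitrary "edge of space"


def circular_orbit_speed(altitude_m):
    """Speed needed to stay in a circular orbit at the given altitude."""
    return math.sqrt(GM_EARTH / (R_EARTH + altitude_m))


def climb_energy_per_kg(altitude_m):
    """Potential energy gained per kg going straight up from the surface."""
    return GM_EARTH * (1.0 / R_EARTH - 1.0 / (R_EARTH + altitude_m))


v = circular_orbit_speed(LEO_ALTITUDE)          # m/s
kinetic = 0.5 * v ** 2                          # J per kg at orbital speed
climb = climb_energy_per_kg(SPACE_ALTITUDE)     # J per kg to reach 100 km

print(f"orbital speed: {v * 3.6:,.0f} km/h")
print(f"kinetic energy in orbit: {kinetic / 1e6:.1f} MJ/kg")
print(f"energy to climb to 100 km: {climb / 1e6:.2f} MJ/kg")
```

Under these assumptions, the kinetic energy of going sideways fast enough is tens of times larger than the energy of merely getting up there.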

This came up before when people were comparing Blue Origin's landing of a space-going rocket with SpaceX landing of the first stage of an orbital rocket. The velocities involved are completely different as is the fact that Blue Origin's rocket was basically going up and down, but the Falcon 9 goes mostly Eastward, so it needs to lose horizontal velocity as well.

This is a very common misconception, and it doesn't help when popular entertainment feeds it:



If the missiles had gone straight up, as they say in the movie, they'd fall back to Earth. (Yes, that's me criticizing the Physics of X-Men Apocalypse.)



Book mini-review: Somewhither by John C Wright

This book came to my attention in a serendipitous way, which as I'll note at the bottom of the review is curious. It won an award at a conference, which in general has no impact on my choices since I read the free sample and decide then. (Except when I have a long history with the author, like Pratchett, Stephenson, Clarke, Pournelle, Heinlein, Benford, Brin, Bear, Sawyer, and maybe 10-15 others.)

So, for me the sample it is. And it was. And it did. Did convince me to buy. (My decision is not about the money, since fiction is very cheap; it's about buying a book I won't read and its file on my Kindle mocking me for eternity.)

Well played, Castalia House: a long sample is the way to go. This book was described as "science fantasy," which I was wary of. I like science fiction as long as it's mostly about the science part; in the past I've begun reading fantasy once or twice only to stop after a few pages; it's not my thing.

Yes, I get that "Alex teched the tech and the enemy ship exploded" is about as scientific as "Chris invoked the secret ritual of the Obelisk and the enemy cavalry turned to dust." But I don't read scifi of the "Alex" type either.

(I read Terry Pratchett because that's not fantasy. Like the famous trilogy in five parts plus extra volume, Pratchett's work is social commentary on our world.)

The interesting combination of "science fantasy" is aptly described by one of the characters from, let's call it, "a magic-based civilization" commenting on our technological-based civilization:


The book-world is internally consistent and it fits (well, sort-of, but spoilers) with our actual reality (so that's the science part); there's a reasonable (again, sort-of but spoilers) backstory that sets up the differences.

My only bone to pick with the book is that there's a lot of sadism, torture, and malevolence; it fits with the story and the general logic (and that's all I can say without spoilers), but reading it sometimes felt like being inside a Hieronymus Bosch painting.

Of course, Somewhither is part one of a planned trilogy, so it's a bit irritating to have to wait until the next two are written. (But as Sir Humphrey Appleby aptly pointed out to the minister, "there are significant difficulties in circulating papers before they are written," so we'll have to wait.)

Meanwhile, I realized that this John C Wright was the same John C Wright who wrote the scifi series Count to a Trillion and sequels, which I liked a lot; when I say I'm not a people person, that's what I mean. I read the book descriptions and tend to skip that "author" tidbit.



Nassim and the Intellectuals

Nassim Nicholas Taleb expands his old "Intellectuals Yet Idiots" Facebook post to article length. Many hurt elite feelings ensue.

No, I won't summarize it, just read the whole thing. And the comments.

Here's Nassim in his Twitter persona, making friends and influencing people:


Ok, the end of the article (did you read the whole thing? Go read the whole thing!) is worth reproducing here:

Well, I'm safe, I deadlift. (Yes, this article about the decline of intellectual discourse ends with a "do you even lift, bro?" gym taunt and a photo of Zydrunas Savickas.)

Here's an ode to intellectualism. Or Zydrunas Savickas, one or the other.





Videos about many things

Numberphile features Persi Diaconis explaining what makes dice fair:


And here's a paper on fair dice by Diaconis and Keller.


I watched a few train videos, too. No, I'm not on any recreational drugs. It's just relaxing to see these big machines in motion. And went over some of the back-catalogue of Agent JayZ's Jet Engine videos.

Agent JayZ gets the "most Canadian of Canadians" award for responding to people commenting on something that happened at an air show they hadn't attended with: [mildly tired tone] "It would be great if the people who weren't there stopped telling me, who was there, what happened."

Ah, those Canadians and their flaring tempers. Must be the snow.


Pure computer envy (but it should be running Linux):



Something more technical: K-means and image segmentation, by Computerphile



Related: as a professional presenter, I always use the very best in audio-visual techniques when explaining the difference between supervised learning (like choice modeling, left below) and unsupervised learning (like cluster analysis, right below).


Yes, that's quad paper. Better yet, quad paper with color pens. Oh how far we've come from explaining EECS with an automatic pencil on grease-stained napkins.



Gym-related geekery

I finally accepted that I'm recovering too slowly for a standard three-day split, so I'm moving to a four-day split, adding one day for accessory work and conditioning. But I'm doing it differently from the usual four-day split programs:

1. I'm still only going to work out three times a week, so I'll be on a nine-day recovery cycle instead of a seven-day recovery cycle. I know gymbros would have a problem with the math, but it isn't that difficult: just learn the sequence of workouts and check which one you did last.

2. I'm placing the accessories/conditioning workout between squat and bench day, rather than at the end of the three-day program. That's because doing conditioning so close to squat will hurt the squat much more than doing it close to bench hurts the bench.

This change will not affect my rowing and the low-intensity (not workout, rather active rest) "cardio" exercising while imbibing knowledge.

As per T-Nation's famous motivation poster, "Motivation is for newbies; veterans grind." But still, I found this excerpt from the Arnold Strongman Classic 2016 motivating:



It certainly motivated me to stop whingeing about my posterior chain on Saturdays (until now Friday was deadlift day) after I saw Eddie "the Beast" Hall deadlift 469 kg:


(Though in my defense, scaled to bodyweight, I deadlift more than Eddie...)



Hidden nostalgia

The title is a play on:


Saturday, September 17, 2016

The problem with wireless earbuds for audiophiles

(The lack of a headphone jack on the iPhone 7 upset many people, but not me. I generally don't use my iPhone as a source of music, and if I were to do so in the future, I'd use an external DAC/Amp.)

To see why the wireless earbuds are a problem for audiophiles, we need to begin at the opposite end of the process, when analog signal (music) becomes a digital representation.

There are two steps in the process: first, the continuous analog signal is sliced in time, "sampled," so that it's now represented by a sequence of analog levels; second, those analog levels are compared with a finite scale, the digital scale, and the best approximation is used to represent the level, thusly:


There are two sources of information loss (or "noise") in this process:
1. By taking level slices of a continuous curve, the sampling creates an imperfect representation of the curve; that's called sampling noise. The longer the slices, that is the less often the analog input is sampled, the higher this sampling noise. 
2. By forcing the analog samples, which are on a continuous scale, to match the limited levels of a digital scale, the process creates a second type of noise, quantization noise. In the above example, the difference between the digital output for periods (1) and (2) is higher than the difference between the analog samples for those periods. Also, periods (2) and (3) have the same digital output, despite the different analog sample levels.
To reduce sampling noise we can sample more often, that is have thinner slices of time so that there are more analog samples to represent the same curve. To lower quantization noise we can have more digital levels; typically the number of levels is a power of 2, since we use binary coding.
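The two steps can be sketched in a few lines. This is only an illustration: the 5 Hz test tone, the 100 samples per second, and the 3-bit and 8-bit scales are made-up values chosen to make the effect obvious, and the function names are hypothetical.

```python
import math


def sample_and_quantize(signal, duration_s, sample_rate_hz, bits):
    """Sample a continuous signal (a function of time with values in
    [-1, 1]) and round each sample to the nearest of 2**bits evenly
    spaced levels. Returns (time, analog, code, quantized) tuples."""
    levels = 2 ** bits
    step = 2.0 / (levels - 1)              # spacing of levels on [-1, 1]
    out = []
    for n in range(int(duration_s * sample_rate_hz)):
        t = n / sample_rate_hz             # step 1: slice time
        analog = signal(t)
        code = round((analog + 1.0) / step)    # step 2: pick nearest level
        out.append((t, analog, code, code * step - 1.0))
    return out


def max_quantization_error(samples):
    """Largest gap between an analog sample and its digital level."""
    return max(abs(analog - quantized) for _, analog, _, quantized in samples)


def tone(t):
    return math.sin(2 * math.pi * 5.0 * t)     # hypothetical 5 Hz test tone


coarse = sample_and_quantize(tone, 1.0, 100, bits=3)
fine = sample_and_quantize(tone, 1.0, 100, bits=8)
print("3-bit worst error:", max_quantization_error(coarse))
print("8-bit worst error:", max_quantization_error(fine))
```

Adding bits shrinks the worst-case quantization error; raising `sample_rate_hz` is the corresponding knob for the sampling side.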

For example, CD encoding used 44,100 samples per second per channel at 16 bits of resolution (allowing $2^{16}$ or 65,536 different levels); this was deemed enough for music since it allowed for an upper frequency limit of over 20 kHz (generally considered the limit of human hearing) and a dynamic range of 96dB (each bit adds 6dB; the choice of 96dB was widely panned by audiophiles as too small).*

As with everything in engineering (and in life, really) this was a matter of trade-offs. Later we've gone beyond these limits with other standards like SACD, for example. But the problem of trade-off remains, typically that of space or bandwidth against quality of reproduction.

Before compression, the total number of bits necessary to represent a stereo signal sampled at a rate of $s$ samples per second and a number of digital levels $2^{N}$ is $2 \, s \, N$ bits per second; because there's a lot of redundancy in music (no, it's not just Philip Glass), there are opportunities for compression.
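Plugging the CD numbers into that formula, as a small sketch (the function names are just for illustration; the 6 dB per bit rule of thumb is $20 \log_{10} 2$ rounded down):

```python
import math


def raw_bitrate_bps(sample_rate_hz, bits_per_sample, channels=2):
    """Uncompressed bit rate: channels * s * N bits per second."""
    return channels * sample_rate_hz * bits_per_sample


def dynamic_range_db(bits_per_sample):
    """Ratio between the largest and smallest level steps, in decibels."""
    return 20 * math.log10(2 ** bits_per_sample)


def nyquist_limit_hz(sample_rate_hz):
    """Highest frequency representable with ideal filtering."""
    return sample_rate_hz / 2


cd_rate = raw_bitrate_bps(44_100, 16)          # 1,411,200 bits per second
cd_range = dynamic_range_db(16)                # a little over 96 dB
cd_nyquist = nyquist_limit_hz(44_100)          # 22,050 Hz

print(f"{cd_rate:,} bits/s, {cd_range:.1f} dB, {cd_nyquist:,.0f} Hz")
print(f"about {cd_rate * 60 / 8 / 1e6:.1f} MB per minute of stereo music")
```

That uncompressed rate is the baseline any compression scheme, lossless or lossy, is trying to beat.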

Sometimes the music is compressed without losing information, called "lossless compression"; an example is FLAC, which has all the information necessary to reconstruct the original uncompressed digital music file. This is similar to compressing a data file for transmission; after decompression the reconstituted file must be identical to the original. (FLAC uses regularities of music to compress data more efficiently than a general compression algorithm.)

Sometimes the compression loses information that is deemed unnecessary, called "lossy compression"; MP3 compression is lossy. Lossy compression adds sampling and/or quantization noise to the original data, though the design of the compression scheme is supposed to minimize the aural effect of those additional errors in some trade-off with the compression ratio.

On the other hand, because plastic CDs and wireless signals sometimes get damaged, some space or bandwidth has to be used for error-correction codes and other digital administrative minutiae. When the iPhone connects to the earbuds by wire, it can send an analog signal, but when it connects via Bluetooth, the signal is digital, must be compressed for transmission and requires a lot of network administration detritus.

So, one of the first questions wireless earbuds raise is: is Apple sending enough data over that Bluetooth connection for an audiophile? This isn't the only question, though.

Using the very best in advanced engineering CAD displays, we can see that this is only the first of four classes of problems:

Four classes of reasons why wireless earbuds are not for audiophiles.



Problem class 1: Quality of the Data

Apple's decision to go wireless changes the transmission of data between the main processor and the digital-to-analog conversion from a wired connector inside the phone, and protected from most interference, to a digital transmission over a noisy channel (Bluetooth). That means that a lot of other things have to be transmitted, in particular handshaking data, error-correction codes, and diagnostic signals.

The problem is mostly that Apple went from being a perfectionist's personal fiefdom (during the reign of His Steveness, may his divine hand bless you with a bounty of new MacBookPros) to being a company looking to make a buck. And companies looking to make a buck make different trade-offs.

His Steveness wanted the best. He might not have gotten it always, but he made products for people who wanted to brag they had the best. (Even when by all objective measures they didn't.) But now, the whole company seems to be into the "milk our brand while it lasts" phase of its corporate life cycle, so I'd venture that their trade-offs are much closer to the general public's than those of the fringes.

His Steveness ran the company targeting the fringes, so that the general [Apple-buying] public could pretend, er, aspire to be in the fringes. Depending on who you ask that's "aspirational marketing" or a "reality distortion field."

Not anymore. Not for Apple.

Taking a lossy compression like MP3 and compressing even further for the earbuds (possibly limiting both the frequency and the dynamic ranges) isn't a recipe for audiophile sound. It does work for phone calls, and that's probably what most phones are used for. But for music... no.

(At this point I should mention that digital audiophiles moved on from Apple a while ago, putting up with miserably bad, er, less-than-optimal interfaces to use things like, to go entry-level, a second-generation Fiio X5.)



Problem class 2: Quality of Digital To Analog Conversion

A second source of problems is the digital-to-analog conversion circuitry. Among the many problems that can come from a cheap (and low-power, which is important in wireless earbuds) DAC, the most obvious are reproduction errors (the same digital input doesn't map to the same voltage consistently, or the difference between digital levels doesn't match to the appropriate difference in voltages). This isn't that much of a problem in 2016 (it used to be in the 1990s).

Another, more serious problem has to do with the precision of the timing, which is one of the major reasons why if you care about computer music you'll get an external DAC, possibly a Chord Mojo or an Audioquest Dragonfly. (Or maybe something from the brand that can't be named.)

Even small errors in timing (some of which are induced by the buffering and data processing necessary to extract the digital music from the wireless signal) can lead to significant phase distortion, in that the 'time' used to reproduce the music doesn't match real time.

To illustrate this problem, consider the following phase-distorted sine wave (slightly exaggerated to make the case visible, but even very small phase distortions sound horrible):


Comparing the two periods of the distorted wave, T1 and T2, you can see that phase distortion in this case induces frequency variation. This means that instruments will sound as if they are out-of-tune, and [if you're over 30 you'll get this reference] like your brand-spanking-new iPhone is a cassette player running out of battery power.
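A minimal sketch of why a timing error shows up as a pitch error. It models the distortion as a slow sinusoidal wobble added to the phase of a 1 Hz tone (the wobble size and rate are arbitrary assumptions, exaggerated like the figure) and measures two consecutive periods.

```python
import math

TONE_HZ = 1.0        # nominal tone, so the ideal period is exactly 1 s
WOBBLE_RAD = 1.0     # assumed size of the phase error, in radians
WOBBLE_HZ = 0.25     # assumed rate at which the timing error drifts


def phase(t):
    """Phase of the distorted wave: ideal phase plus a slow wobble."""
    return (2 * math.pi * TONE_HZ * t
            + WOBBLE_RAD * math.sin(2 * math.pi * WOBBLE_HZ * t))


def time_at_phase(target, lo=0.0, hi=10.0, iterations=60):
    """Bisection: find t where phase(t) == target (phase is increasing)."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if phase(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# A new cycle starts each time the phase passes a multiple of 2*pi.
starts = [time_at_phase(2 * math.pi * k) for k in range(3)]
T1 = starts[1] - starts[0]
T2 = starts[2] - starts[1]
print(f"T1 = {T1:.3f} s, T2 = {T2:.3f} s (ideal: 1.000 s each)")
```

Two cycles of the "same" note come out with different lengths, that is, different frequencies, even though the average over both is still on pitch.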

If you accidentally downloaded a [poorly encoded] FLAC file from a torrent site you accidentally fell into while looking for a French Literature study group, accidentally ran that FLAC file through a FLAC to MP3 converter that you accidentally had on your computer, then accidentally played it and noticed strange warbling and high-pitched glitches, that's an entirely accidental observation of very bad phase distortion.

This is why any audiophile wants a DAC that uses its own timing circuitry and buffer, rather than depend on the shared circuitry involved in network management etc.



Problem class 3: Fixed- vs Variable-Gain Analog Amplification

Many computers (and I assume all iPods and iPhones) have a fixed gain amplifier for the reconstructed analog signal. That means that changes in volume are created by multiplying the digital signal by digital fractions prior to conversion to analog. In essence, removing data from the signal.

For example, to halve a digital number, all you need to do is shift all bits right, discarding the least significant bit and adding a zero at the most significant bit (or, depending on how negative numbers are encoded, adding a copy of the previous most significant bit). This means that one bit of data has been lost. The sound is not just half-volume, but also half-dynamic range; each halving of volume removes one bit or 6 dB of dynamic range:

$\texttt{[1000101001011011]} \rightarrow \texttt{[0100010100101101]} \rightarrow \texttt{[0010001010010110]}$

If the original dynamic range of the data was higher than that of humans (CD or CD-derived online purchase or stream? No, it wasn't!), then this loss isn't important. Otherwise (i.e. basically always), your music just became lower resolution.
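The shift above, as a sketch. For simplicity it treats the 16-bit word as an unsigned bit pattern (an assumption; real audio samples are signed, which changes what gets shifted in at the top but not the loss at the bottom).

```python
BITS = 16


def halve(sample, times=1):
    """Halve an unsigned sample `times` times by shifting right.
    Each shift throws away the least significant bit."""
    return sample >> times


original = 0b1000101001011011
for k in range(3):
    print(format(halve(original, k), f"0{BITS}b"))
# prints:
# 1000101001011011
# 0100010100101101
# 0010001010010110

# Turning the volume back up cannot restore the discarded low bits.
restored = halve(original, 2) << 2
print(format(restored, f"0{BITS}b"), restored == original)

# After two halvings, 16-bit input can only take 2**14 distinct values.
distinct_after_two = len({halve(s, 2) for s in range(2 ** BITS)})
print(distinct_after_two)
```

Two halvings leave a quarter of the levels the data started with: two bits, or about 12 dB, gone.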

In a better sound system (i.e. any external DAC/amp), the analog signal out of the DAC goes into an analog amplifier that has variable gain. In some systems the variable gain is controlled with a knob, in others using a digital interface. But in both cases the amplifier tends to be a digitally controlled variable gain amplifier, in which the analog signal path is all analog and only the gain is controlled by a digital system (typically a feedback network of switchable topology).

(An alternative approach is to take the, say, 16-bit data and shift it 8 bits up into the most significant bits of a 24-bit word, then multiply that by an 8-bit fraction (thus allowing for 256 different volume levels) and feed the result to a 24-bit DAC, whose result will feed a fixed-gain amplifier. This allows for the whole process to be digital as long as possible.)
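That alternative can be sketched the same way. Again the sample is treated as an unsigned 16-bit pattern, and the volume is an 8-bit numerator over 256; both are simplifying assumptions for illustration, and the function name is hypothetical.

```python
def scaled_24bit(sample16, volume8):
    """Place a 16-bit sample in the top bits of a 24-bit word, then
    scale it by the fraction volume8 / 256 (volume8 in 0..255)."""
    widened = sample16 << 8                # 16-bit data, 8 zero bits below
    return (widened * volume8) >> 8        # result still fits in 24 bits


original = 0b1000101001011011
half_volume = scaled_24bit(original, 128)  # 128/256 = one half
print(format(half_volume, "024b"))

# All 16 original bits are still there, just one position lower.
recovered = half_volume >> 7
print(recovered == original)

# Every distinct input still maps to a distinct output at half volume.
distinct = len({scaled_24bit(s, 128) for s in range(2 ** 16)})
print(distinct)
```

The extra 8 bits at the bottom of the word absorb what the 16-bit shift would have thrown away, so the volume change costs no resolution until the 24-bit DAC itself becomes the limit.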

The amplification issue alone is worth getting an external DAC; but it's important to also consider the next point.

You don't say, @AudiophiliacMan

(My Audioquest Dragonfly is usually plugged into a powered USB hub, so it doesn't rely on the computer USB bus power.)


Problem class 4: Power issues

And this is the big big one. You like loud music? Well, expect distortion as soon as the volume gets loud. Because most of these small batteries aren't able to deliver the current needed fast enough. So what happens is that as the output voltage increases by $\Delta v$, requiring a $\propto (\Delta v)^2$ increase in power, the amplifier "fixed" gain starts to decrease, more so the higher the $\Delta v$, and we get... well, we get this:


That compression of the sine wave makes it sound nasal. When your music sounds like that, it's a sign that your amplifier is not able to draw enough power. Note that this is different from the clipping that happens if the transistors in the output stage enter the saturation regime; in that case, instead of a smooth scrunched sine wave, we get a flat-volume square wave, which makes everything sound like a heavy metal guitar.**
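To make the distinction concrete, here is a toy comparison of the two shapes. The `tanh` curve is only a convenient stand-in for "gain that drops as the signal grows"; it is an assumption for illustration, not a model of any real amplifier or battery.

```python
import math


def hard_clip(x, limit=1.0):
    """Output-stage saturation: anything beyond the limit is cut flat."""
    return max(-limit, min(limit, x))


def soft_compress(x, limit=1.0):
    """Stand-in for starved power: gain shrinks smoothly as |x| grows."""
    return limit * math.tanh(x / limit)


N = 64
# One cycle of a sine wave asking for 1.5x what the output can deliver.
wave = [1.5 * math.sin(2 * math.pi * n / N) for n in range(N)]
clipped = [hard_clip(x) for x in wave]
squashed = [soft_compress(x) for x in wave]

flat_top = sum(1 for y in clipped if y == 1.0)
print("samples stuck at the limit (hard clip):", flat_top)
print("peak of the squashed wave:", max(squashed))
```

The clipped wave spends a run of samples pinned at the limit (the flat top), while the squashed wave keeps a rounded peak that never reaches it.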

Ever wonder why 100W audiophile amplifiers have external power supplies that look bigger than the 1000W power supply on a computer server? That's because they are. Abundant power is an essential part of clean amplification, and without clean amplification the rest doesn't matter. And the way you get abundant power is you have a lot of slack available.

Care to bet how much slack power those earbuds have?



Does it matter?

To whom?

To me, no. I have a number of other, better sources of music, and I use the iPhone as an internet device and, astonishingly, as a phone. Weird, I know.

To those who just want to listen to podcasts, audiobooks, maybe some music in noisy environments? Of course not.

To an audiophile, who for some unexplained reason doesn't get a cheap lossless player like the Fiio X5? Yes, it matters, but this audiophile has the option to get the new Audioquest Dragonfly RED, with a tail adapter for the iPhone, so that's what s/he should do. Pair that with a nice pair of big cans like the Sennheiser 650s (in my opinion the best quality/price cans on the market), and you're set.

To an audiosnob who can't tell the difference between 866kbps Apple lossless and 32kbps mono MP3 but insists on having "the very best," preferably Bang & Olufsen or some other design-heavy, sound quality-light, high-recognition brand? Yes, it will matter a lot. (Audiosnobs have already invaded Head-Fi and other audiophile forums arguing against the iPhone 7 from their usual position, ignorance.)




-- -- -- -- Footnotes -- -- -- --

* Yes, the Nyquist limit for 44.1 kHz sampling is 22.05 kHz... as long as the anti-aliasing filter is a perfect step function in the frequency domain. The universe containing exactly zero perfect step function anti-aliasing filters, I and the entire engineering profession prefer to hedge by saying that it's "over 20 kHz."

When audiophiles say that LPs (Long Play records, aka "vinyl," Olivia Wilde not included) have better sound than CDs, they are usually referring to dynamic range. It's not just that CDs have only 96dB of range, but much worse, that in transferring the music from the master recordings to CD, sound butchers, er, engineers would monkey about with the original dynamic ranges to "make it fit better," which was disastrous for music with broad dynamic ranges.

(The standard example is the butchery of Dire Straits' "Money For Nothing," which was so compressed for the CD that it lost the whole point of the intro. Hey, though I listen almost exclusively to art music and jazz, nostalgia has its place.)


** That's because the sound effect that makes electric guitars sound like that is precisely pre-amping the sound so high that the output stage transistors will saturate and clip the waveform square, at the same time removing almost all volume envelope effects. You can do this to any instrument including voice.

Added later: yes, I know all these effects are digital now. Kids these days! In my day you built your effects with transistors, µA741s and sometimes NE555s. None of that "digitize, FFT, do whatever, convert out" nonsense. We had grit!

Sunday, September 11, 2016

A math-themed Saturday

Friday was a heavy deadlift day; consequently, Saturday was a "let me lie in this recliner in peace (with my posterior chain in pieces)" day. Immobility is better spent with some recreational geekery, and not being a gamer I prefer math, science, or engineering. This was yesterday's; all math, by coincidence.


Movie: The Man Who Knew Infinity


I had read the book and I know some of Ramanujan's results as part of being a math-curious person, but I hadn't seen the movie. Now thanks to the sneakernet – of legally borrowed BluRay discs, not illegal copies on flash drives – I have.

There's not much math in the movie, but it's a good movie.*

Bonus: Numberphile on Ramanujan –




Book: Weapons of Math Destruction, by Cathy O'Neil


String theorist Lubos Motl doesn't like it. I think he's a bit harsh, but mostly correct. There are two problems with the book, IMNSHO:

1. An attribution problem: data-based decision-support systems (DB-DSSs) are blamed for problems that come from their use. That blame correctly belongs to the people making the decisions, either the users of the DB-DSSs or the people who choose/design/program those DB-DSSs. Most of her examples are of people in power hiding behind "math" to further an agenda, just like in other times people in power used "the divine right of kings" to further their agenda.

2. An inconvenient realities problem: there are some real heterogeneities that O'Neil doesn't like, but have real-world consequences (say: young men are more likely than other demographics to die doing something stupid; their life insurance is concomitantly more expensive). O'Neil appears to want at least some of those realities hidden; DB-DSSs find them, therefore she wants them regulated/limited/overruled. I say: if they exist, find them, understand them, address them (for example, take mitigating actions to avoid their exploitation).

It's important to note that the choice we have is not between an ideal world and flawed DB-DSSs; it's between flawed big data, very flawed small data, and zero-data preconceived notions. So, in general, the more data the better:
For example: I'm a Portuguese male; knowing only that, most people would assume I like soccer (I don't). Netflix, YouTube, and Amazon, having precise data about my revealed preferences, never suggest anything soccer-related, though they will suggest a lot of exercise-related content/purchases.
More data leads to more personalized recommendations and away from stereotypes. But...

But there are always outliers. People who are outside the predictive model; people whose data is incorrect; people who have changed significantly over time (most people don't change, but some do); people for whom there's no data; and, as the excerpt above illustrates, pretty much all metrics can be gamed and many are.

So, there's a need for balance, accountability, and clarity. In that, O'Neil is right, but that's not a DB-DSSs problem, it's a problem with every human system. If anything, DB-DSSs have the potential for improving all three. (Balance may appear strange at first, but there's a whole field of multi-criteria decision-making that studies the comparative statics of varying evaluation criteria.)

I might write something longer about this, as there's an increasing number of anti-DB-DSSs books, posts, and videos.

Bonus: here's O'Neil presenting some of the examples from the book –




Paper: "Why does deep and cheap learning work so well?" by Henry Lin and Max Tegmark

http://arxiv.org/abs/1608.08225

Via The Tech Review, I found this arxiv paper explaining the connection between deep learning and the structure of the universe. (Yes, it's a real technical paper. When I say I like math, it's not like the people who "love science," but only if they don't need to learn any.)

Abstract (JCS comments are the lines beginning "In other words"):
We show how the success of deep learning depends not only on mathematics but also on physics: although well-known mathematical theorems guarantee that neural networks can approximate arbitrary functions well, the class of functions of practical interest can be approximated through "cheap learning" with exponentially fewer parameters than generic ones, because they have simplifying properties tracing back to the laws of physics. 
In other words: the reasons these deep learning systems work when math would suggest they shouldn't is that the physical world is a lot more organized than arbitrary math spaces. 
The exceptional simplicity of physics-based functions hinges on properties such as symmetry, locality, compositionality and polynomial log-probability, and we explore how these properties translate into exceptionally simple neural networks approximating both natural phenomena such as images and abstract representations thereof such as drawings. 
In other words: as long as what we're learning is nicely behaved (defined as "symmetry, locality, compositionality and polynomial log-probability"), this trick for massive information-compression will work. 
We further argue that when the statistical process generating the data is of a certain hierarchical form prevalent in physics and machine-learning, a deep neural network can be more efficient than a shallow one. We formalize these claims using information theory and discuss the relation to renormalization group procedures. Various "no-flattening theorems" show when these efficient deep networks cannot be accurately approximated by shallow ones without efficiency loss - even for linear networks. 
In other words: You really need the "deep" in deep learning.

Bonus: An introduction to deep learning, by Andrew Ng –




Video: A Hole in a Hole in a Hole by Numberphile

Ah, the crazy world of topology...


There's some extra footage in Numberphile's second channel:



-- -- -- -- -- -- --

* For people who say that it's impossible to make a compelling movie with complicated technical content in it, I recommend watching the movie Copenhagen, where several elements of quantum mechanics are interwoven with the story.

Thursday, September 8, 2016

Multitasking at the gym



Powerlifting and other training (including conditioning) are not multi-taskable. It's very important to keep one's concentration and focus on the exercise. I cringe when I see people talking with each other while moving metal. Even during warm-up sets; perhaps especially during warm-up sets, when the low weight allows one to do a preflighting of the movement, check for any anomalies in mobility or weak or sore prime movers or stabilizers.

When walking short distances, cooking, or doing housework, I tend to listen to podcasts or sometimes to the audiotrack of YouTube hangouts (basically the equivalent of radio's Morning Zoo). These are ways to get some low-density information into the brainpan without distracting too much from the errands. (I also listen to podcasts on shared transportation, like shuttles. Too much entropy for anything else.)

Some podcasts I listen to (there are more; I usually only listen to a few episodes a week):


(Yes, I have a significant déformation professionnelle.)

When I go for a real walk, what I call a walk-n-think, I typically listen to music, not any sources of information. The point is to think and clear the cobwebs of my mind. I find the Baroque a particularly good cobweb-solvent period. Here's a walk-n-think with a side-trip to exchange books at the SFPL:

Walk SF January 30, 2016

Once in a repetitive-motion machine in the gym, for oxygenation not conditioning purposes, the main determinant of the type of content is the movement of the head, in particular the eyes.

When walking on treadmills (my preferred cool-down approach) or rowing on a machine (which for me is real exercise, but of form and rhythm, not muscle), the head moves too much to fix the eyes on a screen; as the activity itself requires less attention than the errands, freeing attention for content, my choices of media are audio lectures and audio books.

(I only run on treadmills for High-Intensity Interval Training, which is conditioning, which means it cannot be multitasked. When doing anything that stresses the body, I always want 100% of the attention to be on the exercise. I have this strange desire to avoid injury, ridicule, and absence of gains; sort of the philosophical opposite of CrossFit.)

I should clarify that I'm using "lecture" to mean all sorts of purposeful speeches, not just university lectures. I do have a number of these speeches and lectures which work out well, many of them extracted from videos of talks where there were no significant visuals (or the visuals were the dreaded "power points," which are speaker's notes not audience-centered visuals).

As for audiobooks, I've been a Platinum member of Audible for fifteen years, which means I have two new books per month, which I complement by filling up on the seasonal sales and the occasional extra purchase.

Here are a few of my latest Audible purchases:



On average I listen to around 30 audiobooks per year, some of which are re-listens.

(Yes, I re-read and re-listen to books. There are some books I read pretty much every year… Waugh, Wilde, and Wodehouse; certain Poirots and Maigrets; a few favorite Discworld pieces. There are 1000-page books I read every year, though that's just Anathem. And Cryptonomicon. And Reamde. And Seveneves, now on its second year. Guess who my favorite living author is.)

For other machines, like elliptical runners, stairclimbers, and exercise bicycles, the head doesn't move, so it's feasible to use the eyes. My old-but-trusted iPad 1.0 has seen this gym duty pretty much from the first day I bought it, which was the day it came out. (100% impulse purchase, as I was coming back from brunch and passed an Apple Store.)

Though in the past I've read books (paper books), journals (academic magazines), and magazines on paper on these machines, and have evolved to read electronic versions of these, I find that I prefer to give the eyes a break by letting them watch video instead of processing written words. I tend to watch lectures (again including speeches, but in this case a lot more real lectures) on the elliptical and the stairclimber, and to read books (ebooks with large type) only on the exercycle.

(Basically I use elliptical, exercycle, and stairclimbers in my building's exercise room. It's not a "gym event," rather an "I need to take a break and instead of vegetating in front of the TV, which I no longer have service for, I can go do some movement while imbibing some basic knowledge.")

I keep a Rite-in-the-rain notebook and a Fisher space pen nearby in case I want to make notes, something that confuses other users of our exercise room. And that others have started to copy.

I hasten to point out that despite the déformation professionnelle mentioned above, I tend to think of these books and lectures as leisure, so I keep them broadly within my areas of interest but not focussed on my actual area of work. For example, here are a few courses that I've enjoyed on the elliptical machines in the exercise room:









It's worth mentioning that real intellectual work cannot be multitasked, as indicated by the position of textbooks and research papers in the diagram. Anytime I'm looking to learn something, that requires dedicated attention, note-taking, and a block of dedicated time.

I don't mean work-related textbooks (though American textbook prices do their darndest to discourage the intellectually curious from serious study) or research papers (ditto with the gating, but public libraries and authors' own webpages are a good workaround), but even when I'm trying to learn something, say geology, out of pure curiosity, reading textbooks and research papers has been a much better experience than the materials that now pass for science popularization.

(My opinion on the decline of science popularization is well established in this blog.)

One thing I used to do at the gym (the real big gym, not the exercise room and not the powerlifting gym I occasionally go to instead of driving to the big gym) and eventually stopped due to social pressure, was to watch FoodTV network on the gym TV while cooling down on a treadmill or an elliptical, after 90-120 minutes of iron and conditioning. For some reason, those whose entire workout is 30 min of slow walking on the elliptical (what I call a Potemkin workout, still better than Planet Fatness or CrossFit) were not happy with my selection of programming.

Go figure.

Tuesday, August 30, 2016

Some thoughts on quant interviews

Being a curmudgeonly quant, I started reacting to people who "love" science and math with simple Post-It questions like this:


(This is not a gotcha question, all you need is to apply the Pythagorean theorem twice. I even picked numbers that work out well. Yes, $9 \sqrt{2}$ is a number that works out well.)

Which reminds me of quant interviews and their shortcomings.

I already wrote about what I think is the most important problem in quantitative thinking for the general public, in Innumeracy, Acalculia, or Numerophobia, which was inspired by this Sprezzaturian's post (Sprezzaturian was writing about quant interviews).


In search of quants

That was for the general public. This post is specifically about interviewing to determine quality of quantitative thinking. Which is more than just mathematical and statistical knowledge.

One way to test mathematical knowledge is to ask the same type of questions one gets in an exam, such as:

$\qquad$ Compute $\frac{\partial }{\partial x} \frac{\partial }{\partial y} \frac{2 \sin(x) - 3 \sin(y)}{\sin(x)\sin(y)}$.

Having interacted with self-appointed "analytics experts" who had trouble with basic calculus (sometimes even basic algebra), this kind of test sounds very appealing at first. But its focus is on the wrong side of the skill set.
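(For the record, the exam question above is less fearsome than it looks: the fraction separates into $2/\sin(y) - 3/\sin(x)$, one term per variable, so the mixed partial is zero wherever the sines are nonzero. A minimal numerical check, sketched in Python with a central finite difference; the helper names `f` and `mixed_partial`, the step size, and the sample points are illustrative choices, not part of the original question.)

```python
import math

def f(x, y):
    """The exam expression: (2 sin x - 3 sin y) / (sin x sin y)."""
    return (2 * math.sin(x) - 3 * math.sin(y)) / (math.sin(x) * math.sin(y))

def mixed_partial(g, x, y, h=1e-3):
    """Central finite-difference estimate of d2g/dxdy at (x, y)."""
    return (g(x + h, y + h) - g(x + h, y - h)
            - g(x - h, y + h) + g(x - h, y - h)) / (4 * h * h)

# The expression equals 2/sin(y) - 3/sin(x): one term depends only on y,
# the other only on x, so the mixed partial should vanish (up to
# floating-point noise) at any point where both sines are nonzero.
for x, y in [(0.5, 1.0), (1.2, 0.7), (2.0, 2.5)]:
    print(x, y, mixed_partial(f, x, y))
```

The printed values should be indistinguishable from zero at floating-point precision, which is the whole answer.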

Physicist Eric Mazur has the best example of the disconnect between being able to answer a technical question and understanding the material:

TL;DR: students can't apply Newton's third law of motion (for every action there's an equal and opposite reaction) to a simple problem (car collision), though they can all recite that selfsame third law. I wrote a post about this before.

Testing what matters

Knowledge tests should at the very least be complemented with (if not superseded by) "facility with quantitative thinking"-type questions. For example, let's say Bob is interviewing for a job and is given the following graph (and formula):

Nina, the interviewer, asks Bob to explain what the formula means and to grok the parameters.

Bob Who Recites Knowledge will say something like "it's a sine with argument $2 \pi \rho x$ multiplied by an exponential of $- \kappa x$; if you give me the data points I can use Excel Solver to fit a model to get estimates of $\rho$ and $\kappa$."

Bob Who Understands will start by calling the graph what it is: a dampened oscillation over $x$. Treating $x$ as time for exposition purposes, that makes $\rho$ a frequency in Hertz and $\kappa$ the dampening factor.

Next, Bob Who Understands says that there appear to be 5 1/4 cycles between 0 and 1, so $\hat \rho = 5.25$. Estimating $\kappa$ is a little harder, but since the first 3/4 cycle maps to an amplitude of $-0.75$, all we need is to solve two equations, first translating 3/4 cycle to the $x$ scale,

$\qquad$ $ 10.5 \,  \pi x = 1.5 \,  \pi$ or  $x= 0.14$

and then computing a dampening of $0.75$ at that point, since $\sin(3/2 \, \pi) = - 1$,

$\qquad$  $\exp(-\hat\kappa \times 0.14) = 0.75$, or $\hat \kappa = - \log(0.75)/0.14 \approx 2.1$
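The same back-of-the-envelope arithmetic as a minimal Python sketch. It assumes the formula on the graph is $\sin(2 \pi \rho x) \, e^{-\kappa x}$, as in the verbal description above, and uses only the two eyeballed readings (5 1/4 cycles, first trough at $-0.75$); the variable names are illustrative. Because it carries the unrounded $x = 1/7$ through instead of $0.14$, its $\hat\kappa$ comes out at about 2.0.

```python
import math

# Eyeballed readings from the graph (rough readings, not fitted values).
cycles_in_unit_interval = 5.25   # about 5 1/4 cycles between x = 0 and x = 1
first_trough_depth = 0.75        # |y| at the first trough, 3/4 of a cycle in

# With x on a 0-to-1 scale, cycles per unit of x is the frequency.
rho_hat = cycles_in_unit_interval

# The first trough is where 2*pi*rho*x = 3*pi/2, i.e. x = 3 / (4*rho).
x_trough = 3 / (4 * rho_hat)

# At the trough the sine equals -1, so exp(-kappa * x_trough) = 0.75.
kappa_hat = -math.log(first_trough_depth) / x_trough

print(f"rho_hat   = {rho_hat:.2f}")
print(f"x_trough  = {x_trough:.3f}")
print(f"kappa_hat = {kappa_hat:.2f}")
```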

Bob Who Understands then says, "of course, these are only approximations; given the data points I can quickly fit a model in #rstats that gets better estimates, plus quality measures of those estimates."
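What "fit a model" might look like, as a hedged sketch. The post's data points aren't available here, so the code below generates a noisy stand-in from known parameters (clearly synthetic) and then refines rough starting guesses with a crude least-squares grid search. This is pure-Python scaffolding, not the #rstats fit Bob has in mind, which would normally use a proper nonlinear least-squares routine and report quality measures too; `model`, `sse`, and `grid_fit` are hypothetical helper names.

```python
import math
import random

def model(x, rho, kappa):
    """Decaying oscillation: sin(2*pi*rho*x) * exp(-kappa*x)."""
    return math.sin(2 * math.pi * rho * x) * math.exp(-kappa * x)

def sse(points, rho, kappa):
    """Sum of squared errors of the model over (x, y) points."""
    return sum((y - model(x, rho, kappa)) ** 2 for x, y in points)

def grid_fit(points, rho0, kappa0, half_width=0.5, steps=40, rounds=4):
    """Least-squares fit by repeated grid search around a starting guess.

    Each round scans a (steps+1) x (steps+1) grid centred on the current
    best (rho, kappa), then shrinks the search window by a factor of 5.
    """
    rho, kappa = rho0, kappa0
    for _ in range(rounds):
        offsets = [half_width * (2 * i / steps - 1) for i in range(steps + 1)]
        candidates = [(rho + dr, kappa + dk) for dr in offsets for dk in offsets]
        rho, kappa = min(candidates, key=lambda p: sse(points, p[0], p[1]))
        half_width /= 5
    return rho, kappa

# Synthetic stand-in data: known parameters plus Gaussian noise.
random.seed(0)
true_rho, true_kappa = 5.25, 2.0
xs = [i / 200 for i in range(201)]
points = [(x, model(x, true_rho, true_kappa) + random.gauss(0, 0.02)) for x in xs]

# Start from rough guesses, as one would after eyeballing the graph.
rho_fit, kappa_fit = grid_fit(points, rho0=5.0, kappa0=2.5)
print(f"rho_fit = {rho_fit:.3f}, kappa_fit = {kappa_fit:.3f}")
```

The grid search only works because the eyeballed guesses start it close to the right answer; the squared-error surface of an oscillation has many local dips in $\rho$, which is one more reason the quick reading of the graph matters.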

(Nerd note: If instead of $e^{-\kappa x}$ the dampening had been $2^{-\kappa x}$, then $1/\kappa$ would be the half-life of the process; but the numbers aren't as clean with base $e$.)
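The nerd note in numbers, as a small Python sketch (the function names are made up for illustration): with base 2 the half-life is $1/\kappa$, while with base $e$ it is $\ln(2)/\kappa$, which is where the less clean numbers come from.

```python
import math

def half_life_base2(kappa):
    """x at which 2**(-kappa * x) has dropped to 1/2."""
    return 1 / kappa

def half_life_base_e(kappa):
    """x at which exp(-kappa * x) has dropped to 1/2."""
    return math.log(2) / kappa

kappa = 2.0  # an arbitrary illustrative value
print(half_life_base2(kappa))    # 0.5
print(half_life_base_e(kappa))   # about 0.347
```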

This facility with approximate reasoning (and use of #rstats :-) signals something important about Bob Who Understands: he understands what the numbers mean in terms of their effects on the function; he groks the function.

Nina hires Bob Who Understands. Bonuses galore follow.

Bob Who Recites Knowledge joins a government agency, funding research based on "objective, quantitative" metrics, where he excels at memorizing the 264,482 pages of regulation defining rules for awarding grants.