Writing


How are Stanford's STARR database, the OMOP Common Data Model, and Epic's EHR Related?

December 1, 2022  •  omop | ehrs | stanford | epic | health IT
Overview of Epic, OMOP CDM, and Stanford's STARR database, and how to use them for EHR research

In this post, I explain what Epic (Chronicles, Clarity, etc.), OMOP CDM, and Stanford STARR are, how they are all related, and what the benefits/features of each are.

How to connect VSCode to your remote server via SSH

October 21, 2022  •  devops | vscode | ssh
Plus debugging tips for common error messages and pitfalls.

If you’ve ever had to ssh into a server to run programs, you may be taking an unnecessary productivity hit each time you relegate yourself to coding in a Jupyter notebook on localhost:8000.

Plotting the Distribution of MLB Batting Statistics Over Time

September 13, 2022  •  baseball | visualization | matplotlib
What is a "good" batting average? An "average" OBP? An "elite" SLG?

Most people know that a batting average over .300 is the mark of a great hitter, and that hitting .200 will land you on the bench.

How to Publish Jupyter .ipynb Notebooks to a Jekyll Static Blog

September 13, 2022  •  jupyter | devops | blogging
How to convert .ipynb to .md files and publish them on your static blog (with images, SVGs, etc.)

Goal: Publish a .ipynb on my Jekyll static site as painlessly as possible.

Matplotlib Tips + Tricks

August 13, 2022  •  python | visualization | matplotlib
Lesser-known matplotlib tricks for frequent users of the Python library

These are all taken from this great 3-hr YouTube tutorial by Ben Root from SciPy 2018. I’ve condensed the main takeaways of the talk into the following list of key concepts / tricks that I hadn’t previously been aware of.

Combining ROC Curves with Indifference Curves to Measure an ML Model's Utility

August 10, 2022  •  AI | model evaluation | statistics

For the below examples, assume we have a binary classification task where the class label is $y \in \{0, 1\}$, and the model’s predictions are $\hat{y} \in \{0, 1\}$.

Sensitivity + Specificity + PPV + TP/FP/TN/FN Formulas

August 2, 2022  •  statistics | AI | models | probability
Formulas for sensitivity, specificity, PPV, NPV, TPR, FPR, prevalence, etc.

A brief cheat sheet / reference guide containing the definitions, formulas, and explanations of the most commonly used model evaluation metrics for binary classification tasks.

What is Tidy Data?

January 25, 2022  •  data | databases
The simplest definition of tidy data.

The definition of “tidy data” is simple:

Notes on Vim Tutor

January 3, 2022  •  vim | cheatsheet | terminal
Cheatsheet of keyboard shortcuts for Vim, learned from vimtutor.

I’ve always been intimidated by vim. After taking MIT’s Missing Semester Course (a free online course that I’d highly recommend), I learned about a built-in utility called vimtutor that automatically comes installed with vim.

Tips to Free Up Mac Storage

December 24, 2021  •  Mac | memory
How I tripled the available storage on my Mac by deleting useless files.

tl;dr – If you have Xcode installed, then you could easily be wasting up to 10% of your disk space.

Malcolm Gladwell v. Chess

October 28, 2021  •  Malcolm Gladwell | chess | LSAT
What can competitive chess teach us about "fast" v. "slow" thinking skills?

In Season 4, Episode 3 of his podcast Revisionist History, Malcolm Gladwell argues against the use of the LSAT in law school admissions. A full transcript of the episode can be read here.