By Eleanor Bennett
6 min read
For our latest expert interview on our blog, we welcomed Michael Chang, Director of Data Science and Machine Learning at Included. Michael helps companies measure and optimize their workforce diversity and inclusion efforts through data.
Prior to Included, Michael worked in various data capacities at Facebook, Teach for America, IAC (InterActiveCorp), and eBay. Michael also enjoys teaching and is an adjunct data science instructor at UCLA Extension and Harvard FAS.
Tell us about the business you represent. What are its vision and goals?
Thanks for having me! I represent Included.AI. Included is a diversity, equity, and inclusion (DEI) data platform that provides insights, predictive analytics, and real-time dashboards to business leaders and hiring managers who want to make more informed hiring and recruiting decisions and take proactive, inclusive, and equitable action when it comes to their people.
Our mission is to give every company clear, quality DEI data insights to guide equitable people efforts that improve culture, lower hiring costs, and improve overall retention. Studies show that diverse teams lead to better outcomes, but diversity and inclusion don't happen by accident. Making sure communities of all backgrounds feel included and have a sense of belonging takes intentionality, infrastructure, and operational support. Included is here to help companies achieve that.
Can you share a little bit about yourself and how you got into the field of artificial intelligence?
I started my professional analytics career back at Ask Jeeves, where I was helping with research studies and asking "why" questions, like why someone is searching about mountain lions at 11 pm. In the end, it doesn't really matter what the data is; inevitably, you start with descriptive statistics: what happened, where did it happen, when did it happen, and so on.
Once you have the basic descriptive patterns of a problem, the next step is understanding why it happened, or its predictors. For me, that meant getting into experiment design, data logging, and lots and lots of A/B testing. You're trying to explain things, either by identifying causal effects or by getting more refined in your data collection. I ran so many A/B tests at IAC and eBay, just trying to figure out what made a user do X instead of Y.
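The A/B testing loop Michael describes ultimately comes down to comparing outcome rates between two variants and checking whether the difference is bigger than chance alone would explain. As a minimal sketch (the function and every number here are illustrative, not from any IAC or eBay experiment), a two-proportion z-test in plain Python:

```python
import math

def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates
    between variants A and B. All inputs are hypothetical counts."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # pooled rate under the null hypothesis that A and B are identical
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # two-sided p-value from the standard normal CDF
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# made-up experiment: 120/2400 conversions on A vs 165/2400 on B
z, p = two_proportion_ztest(conv_a=120, n_a=2400, conv_b=165, n_b=2400)
print(f"z = {z:.2f}, p = {p:.4f}")
```

With these toy counts the lift on B is unlikely to be noise; in a real test you would also pre-register the sample size and metric before peeking at results.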
Then, once you have some understanding of the why, you may want to predict with precision, or you may need to scale your whys rapidly across a wide set of problems. Here you start getting into predictive algorithms, or AI, whether for forecasting, prediction, or causal inference. For me, this meant more school (I got my graduate degree) and continued personal and professional data study and practice, which I then got to apply at Teach for America and Facebook, supporting key partnerships and new products for communities, developers, and education. I did this for a few years before joining Included, where data and AI are at the heart of what we do. My journey is still happening! I look forward to what comes next.
What do your day-to-day responsibilities look like at your organization?
It's a really mixed bag at an early-stage startup. Even though you want process and routine to build up your velocity, no two days are really the same. I would say a third of my days are spent on exploratory, investigative research. This could mean reading papers, compiling studies, and combing through our own customer data. I'm trying to find interesting customer patterns and metric conditions that are conducive to product development and that address market gaps.
The next third is a mix of building and validation. As a new company with new products in a new-ish field, there is no mold, so to speak, so it may take a little more time to get things right. It's important to build intentionally but also to reinspect and validate things, especially when your data and analysis may impact someone's career.
The last third is more operational. Because everything is new, we sometimes have to keep a closer eye on it at first. This means doing extra QA, tuning models, rerunning analyses as new customers or data dimensions come online, and checking in on things cross-functionally. Then, of course, I also have my regular daily administrative tasks: planning, 1:1s, and so on.
What are the key differences between computer science, machine learning and AI?
I would say there is some distinction between computer science and ML/AI. Computer science is a broader discipline and framework than ML/AI and relates to software development in general, while ML and AI are more data-specific disciplines, although the terms ML and AI do get used quite interchangeably.
As an example, at Included, computer science covers the processing, searching, indexing, and formatting of our data in analytics dashboards and insights so that DEI leaders can see their data in clear and insightful visualizations. ML/AI is more about the predictions, inferences, and pattern detection within that data, such as predicting how an interview or screening will go for a historically marginalized candidate in a particular department.
What are some misconceptions that you believe the average person has about AI?
I would make two points. The first misconception is that AI is on the cusp of doing it all. Artificial general intelligence is still a long way away. Today, you still need proper subject-matter expertise even to devise a narrow application of AI. At Included, we have trained our systems to work in the HR and DEI context. If you think about the nuances that come with recruiting (compensation, sources, job posting details, the interview process), there is so much context that needs to be captured and translated for the model, which will then hopefully add insights and context humans may have missed.
The second point, which is related to the first, is the misconception that AI is utterly objective. We've all heard the term "garbage in, garbage out." Well, if a person has implicit bias, then the systems that person designs will collect data in a biased way, and the data being fed to your ML/AI models will also be biased, which will, surprise surprise, produce biased algorithms and outputs. This has enormous implications for workforce hiring, criminal justice, credit scores, any kind of application. Any AI system that can impact someone's health or career really needs to consider things within a DEI framework from the start. At Included, we try to think about these things at the data labeling and assessment level, not just the output level. It's important that we get a candidate's background and job skills right before we do complex modeling on top of them.
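One simple way to catch the kind of bias Michael describes before any modeling happens is to compare selection rates across groups, in the spirit of the EEOC's four-fifths rule of thumb. A minimal sketch with invented data (the group labels, counts, and function are illustrative, not Included's actual schema or checks):

```python
from collections import defaultdict

def selection_rates(records):
    """Selection rate (selected / total) per group from
    (group, was_selected) pairs. All data here is made up."""
    totals, selected = defaultdict(int), defaultdict(int)
    for group, was_selected in records:
        totals[group] += 1
        selected[group] += int(was_selected)
    return {g: selected[g] / totals[g] for g in totals}

# toy screening outcomes: (group label, passed screening?)
records = ([("A", True)] * 40 + [("A", False)] * 60 +
           [("B", True)] * 20 + [("B", False)] * 80)

rates = selection_rates(records)
impact_ratio = min(rates.values()) / max(rates.values())
# four-fifths rule of thumb flags ratios below 0.8 for review
print(rates, impact_ratio)
```

A check like this is only a screen, not a verdict; a flagged ratio is a prompt to inspect the upstream data collection and labeling, which is exactly where the "garbage in" enters.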
What advice would you give to someone wishing to start their career in artificial intelligence?
The world is getting more and more data-rich. Data applications, and therefore AI applications, will only grow in scale and popularity, even in traditionally data-sparse industries. So first, find a mission you are passionate about, whether that's climate change, criminal justice reform, economic inequality, or corporate diversity, equity, and inclusion.
Then, once you've found a niche, immerse yourself in the data, but start small and proceed iteratively. You don't need a multi-layer neural network for every data science problem, and you shouldn't jump into black-box algorithms without understanding the context and the problem you are trying to solve.
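The "start small" advice can be made concrete: before reaching for a neural network, establish the simplest possible baseline and require anything fancier to beat it. A hypothetical sketch (the labels and numbers are invented for illustration):

```python
from collections import Counter

def majority_baseline(train_labels):
    """Simplest possible classifier: always predict the most
    common training label. Any fancier model must beat this."""
    most_common = Counter(train_labels).most_common(1)[0][0]
    return lambda _features: most_common

# toy retention labels, illustrative only
train = ["retained"] * 70 + ["churned"] * 30
predict = majority_baseline(train)

test_labels = ["retained"] * 65 + ["churned"] * 35
accuracy = sum(predict(None) == y for y in test_labels) / len(test_labels)
print(accuracy)
```

If a complex model can't clearly outperform this one-liner on held-out data, the added opacity isn't buying anything, which is the iterative discipline Michael is describing.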
Would you like to share any artificial intelligence forecasts or predictions of your own with our readers?
I think NLP, as a focus area of AI, will experience a breakthrough similar to the one computer vision has had in the last few years. I suppose audio would be the next frontier after NLP.
I also both hope and expect the ethics models and governance of AI to scale and standardize in a meaningful way. The industry needs it, and it will ultimately provide better value to users and companies. In the social-good sector, which sometimes struggles to scale because of its big-tent objectives, this will be even more important.
What is your experience with using AI-backed data analysis or log management tools? What do you think is the benefit of using a log management tool that has machine learning capabilities for an organization?
I'm a big believer in using processes to support operations. At Microsoft, it was only with the right testing tools that writing test cases became a habit, a process, and a best practice.
For AI and DEI, it's the same: you need the right tools. Today, MLOps is a big component of implementing and scaling AI and is the vanguard of extracting extra value after that initial bump. Without it, you're basically holding things together with Scotch tape and won't be able to scale or learn as fast as a team. The same goes for diversity: if you don't have the right tools and infrastructure to promote and interview equitably, diversity is just HR pillow talk.
If you enjoyed this article, why not check out our latest guide on what Kaggle is or our article covering everything you need to know about Prometheus?