Ndcharles Nweke

Data Science & Analytics | IT Operations

A data generalist with experience in IT operations and business automation. My interest is in real-life applications of data to business and everyday problems.

Building an AI-Powered Community Matching Engine


Before there was an architecture diagram, a matching algorithm, or a single API endpoint, there was a problem I noticed before anyone else did.

The ALX Nigeria community was growing fast. What started as a network of learners quickly became tens of thousands of graduates spread across every state of Nigeria, each on a different path, at a different stage, with different goals. And yet, they all had one thing in common: the need to connect with the right people.

Not just anyone, but people who understood their journey.

At that growth rate, meaningful connections became harder to facilitate, so I set out to build a way to connect people. My early attempts relied on manual processes: spreadsheets, bulk emails, and meeting coordination. They proved the value of structured matching but quickly became unsustainable.

The challenges were consistent: mismatched pairings, no-shows, limited feedback, and increasing operational overhead with each cycle.

It became clear that this wasn’t just a coordination issue; it required a purpose-built system.

What we needed was an engine that could intelligently match people, automate the interaction flow, and scale without manual intervention.

This document outlines how that system, ALX Connect, was designed and built.

For more details on the backstory of this journey, check my blog article


Building ALX Connect

Early-career professionals often struggle to find the right guidance, mentors, and opportunities. ALX Connect solves this by enabling users to discover and connect with other members of the community based on skills, experience, and career interests.

ALX Connect is a networking and mentorship discovery platform built for the ALX community. It helps learners and alumni connect based on skills, experience, and career interests, creating a collaborative ecosystem where knowledge and opportunities are shared.

Two features set ALX Connect apart: an automated matching system and an automated feedback system. The matching system pairs community members without manual coordination. The feedback system is designed not only to improve the platform but also to identify and highlight emerging mentors within the community who consistently offer valuable guidance.

By combining machine learning, data-driven recommendations, and an automated feedback mechanism, ALX Connect helps community members:

Ultimately, ALX Connect transforms the ALX community into a living network of knowledge, support, and opportunity, helping talented individuals grow and succeed in the global workforce. Best of all, it requires little effort from the team.

The live project is here


Implementation Techniques

The implementation comes in two versions: one deployed on Cloud Run (full semantic model) and one on Render (lightweight TF-IDF).

Text Embedding

Sentence Transformers (all-MiniLM-L6-v2), a pretrained transformer model that generates dense semantic vectors, was chosen for its balance of speed and accuracy; its 384-dimensional output also matches the target dimension. The Render version uses TF-IDF + TruncatedSVD in place of the sentence transformer, because Render's memory constraints prevent loading large transformer models. This combination converts text to sparse term-frequency vectors, then reduces them to 384 dimensions. It captures keyword overlap but, unlike sentence transformers, lacks deep semantic understanding.
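To make the keyword-overlap behaviour concrete, here is a minimal pure-Python TF-IDF sketch. It is not the project's code: it leaves out the TruncatedSVD reduction step, and the naive tokenizer, the smoothed IDF formula, and the sample profiles are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse {term: weight} dict per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed IDF (an assumed formula) keeps every weight positive.
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical profile texts, for illustration only.
profiles = [
    "python developer interested in data science",
    "data science learner using python",
    "programmer who enjoys machine learning",
]
vecs = tfidf_vectors(profiles)
overlap = cosine(vecs[0], vecs[1])     # shares "python", "data", "science"
no_overlap = cosine(vecs[0], vecs[2])  # related topic, but no shared words
```

The first and third profiles describe similar people, yet they share no words, so their TF-IDF similarity is zero. That is the gap a sentence-embedding model is meant to close.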

Similarity Index

HNSWlib (Hierarchical Navigable Small World), a dedicated approximate nearest-neighbour index configured here for cosine similarity, was used because it has a simple API and performs well for my use case. A scikit-learn nearest-neighbour search was also implemented as a fallback in case HNSWlib fails (this was added for Render). It is slower but guarantees exact nearest neighbours.
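An exact search is slower because it compares the query against every stored vector. The sketch below shows that brute-force idea in pure Python; it uses neither HNSWlib nor scikit-learn, and the function names and toy embeddings are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def exact_neighbours(vectors, query_index, k):
    """Return the k most similar (index, similarity) pairs, excluding the query."""
    query = vectors[query_index]
    scored = [
        (i, cosine_similarity(query, v))
        for i, v in enumerate(vectors)
        if i != query_index
    ]
    # Every candidate is scored, so the result is exact rather than approximate.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 3-dimensional embeddings (the real ones have 384 dimensions).
embeddings = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
neighbours = exact_neighbours(embeddings, query_index=0, k=2)
```

Each query costs one comparison per stored profile, which is acceptable at a few thousand profiles but is the cost an approximate index avoids at larger scale.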

Matching Strategies

The matching system uses a multi-stage similarity and clustering pipeline designed to maximize high-confidence matches while ensuring every profile is paired. The process progressively relaxes constraints to maintain match quality before falling back to forced pairing strategies.

The system progressively moves through four stages:

This layered approach prioritizes high-similarity matches while still ensuring complete coverage for all profiles.

If an odd number of singletons remains, the final singleton is added to an existing pair, forming a trio. This guarantees no profile remains unmatched.
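The general idea of relaxing constraints, forcing the remaining pairs, and folding an odd leftover into a trio can be sketched as follows. This is an illustration under assumptions, not the project's pipeline: the threshold values, the greedy pairing order, the function name, and the rule for choosing which pair becomes the trio (highest mean similarity to the leftover) are all hypothetical.

```python
def match_profiles(similarity, thresholds=(0.8, 0.6, 0.4)):
    """Pair profile indices greedily, relaxing the threshold at each pass.

    `similarity` is a symmetric matrix (list of lists) of pairwise scores.
    Returns a list of {"members": [...], "threshold": t} groups, where
    `threshold` is None for forced pairs.
    """
    n = len(similarity)
    # All candidate pairs, best score first.
    candidates = sorted(
        ((similarity[i][j], i, j) for i in range(n) for j in range(i + 1, n)),
        reverse=True,
    )
    unmatched = set(range(n))
    groups = []

    # Threshold passes first, then a forced pass that accepts any score.
    for threshold in list(thresholds) + [None]:
        for score, i, j in candidates:
            if threshold is not None and score < threshold:
                break  # candidates are sorted, so the rest score lower
            if i in unmatched and j in unmatched:
                groups.append({"members": [i, j], "threshold": threshold})
                unmatched -= {i, j}

    # An odd profile out joins the pair it is most similar to on average.
    if unmatched and groups:
        leftover = unmatched.pop()
        best = max(
            groups,
            key=lambda g: sum(similarity[leftover][m] for m in g["members"])
            / len(g["members"]),
        )
        best["members"].append(leftover)
    return groups


# Hypothetical similarity scores for five profiles.
sim = [
    [1.0, 0.9, 0.2, 0.1, 0.3],
    [0.9, 1.0, 0.1, 0.2, 0.5],
    [0.2, 0.1, 1.0, 0.65, 0.1],
    [0.1, 0.2, 0.65, 1.0, 0.2],
    [0.3, 0.5, 0.1, 0.2, 1.0],
]
groups = match_profiles(sim)
```

In this sketch the passes mainly record the confidence tier at which each pair was formed; with five profiles, the fifth ends up in a trio with the pair it resembles most.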

Check out the API backend here


System Flow

The process for generating profile embeddings and identifying similar profiles involves several distinct, sequential steps, ensuring robust similarity matching and detailed output reporting.

Sample output data generated by the ALX Connect matching algorithm.

Memory Management

Throughout the pipeline, memory usage is logged using psutil. After the matching completes, gc.collect() is called explicitly to free temporary objects (e.g., large similarity matrices). This keeps memory usage from outgrowing the available resources. Without it, the service would be shut down repeatedly on memory-constrained platforms like Render, or cost more on auto-scaling infrastructure like Cloud Run.
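A minimal sketch of this pattern is shown below. The helper names, the logger name, and the placeholder matrix are assumptions for illustration; psutil is imported optionally so the sketch still runs where it is not installed.

```python
import gc
import logging

try:
    import psutil  # third-party; treated as optional in this sketch
except ImportError:
    psutil = None

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("matching")


def log_memory(stage):
    """Log and return resident memory in MB (None when psutil is missing)."""
    if psutil is None:
        logger.info("%s: psutil not installed, skipping memory log", stage)
        return None
    rss_mb = psutil.Process().memory_info().rss / (1024 * 1024)
    logger.info("%s: %.1f MB resident", stage, rss_mb)
    return rss_mb


def run_matching(n=300):
    """Stand-in for the matching step: builds a large temporary matrix."""
    log_memory("before matching")
    matrix = [[0.0] * n for _ in range(n)]  # placeholder similarity matrix
    result = len(matrix)
    del matrix    # drop the only reference to the temporary object
    gc.collect()  # explicitly run a garbage-collection pass
    log_memory("after matching")
    return result


rows = run_matching()
```

In CPython, most objects are freed as soon as their reference count drops to zero; the explicit gc.collect() call additionally cleans up objects caught in reference cycles.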

Limitations and Challenges

Many trade-offs were made so that the model could initially run on the free tiers of Render and Cloud Run.

Future Improvements

The system was built for a maximum of 10k profiles and has so far been tested with about 1k. Scaling beyond that would require further testing to confirm it remains fit for purpose. A few additional things are worth considering when scaling:

Conclusion

ALX Connect stands as proof that powerful community tools can emerge from understanding a problem deeply and applying technology thoughtfully. It’s a foundation designed not just to match profiles, but to scale the very human need for connection and mutual growth. The real measure of its success isn’t just in the lines of code or the choice of library, but in the sustained, meaningful interactions it enables within a vibrant community.

[View project on GitHub]
