
Audit your workspace projects

Whether your team is just getting going in Hex or has been using it for a while and things are getting a bit unwieldy, this guide shows an example of how to use our API to review the projects in your Hex workspace and check how well your team is applying the internal guidelines you've set for managing data project lifecycles.

Without clear guidelines on how to categorize and classify projects, it's not uncommon for a communal space like a Hex workspace to become cluttered with assets of inconsistent quality or reliability. When things go unregulated for too long, you'll start to see some symptoms:

  • Content Proliferation and Clutter: The workspace can quickly become crowded with many projects, making it hard to find relevant or up-to-date analyses.
  • Inconsistent Standards: Projects may vary widely in quality, documentation, naming, and structure, making collaboration and reuse difficult.
  • Knowledge Silos: Important insights or methods may be buried in individual projects, reducing knowledge sharing and organizational learning.
  • Access and Security Risks: With many contributors, managing permissions and data access becomes more complex, increasing the risk of accidental data exposure.
  • Maintenance Challenges: Old or unused projects may linger, leading to confusion about which analyses are current or authoritative.

So what's the way out of this mess?

  • Define clear organizational rules for how individuals should curate their analyses.
  • Enable your team on those rules and why they matter!
  • Audit your workspace to see how folks are applying the process. See what's working, what isn't, and adapt accordingly.
  • Rinse and repeat as needed!

Example project review

Below we'll walk through a project designed to audit and report on the organization of projects within a Hex workspace, focusing on compliance with internal lifecycle rules. For this example, we'll audit whether two important data hygiene rules are being followed:

  1. Projects with a "production"-level status applied to them MUST be in a Collection.
  2. Any Project shared with the entire Hex workspace MUST have a production-level status applied to it.

Here are the main steps:

  1. API Setup and Data Collection: For this project we'll rely on the hextoolkit to call the listProjects API endpoint and return project metadata. This endpoint returns information like project titles, creators, views, categories, collections, creation/edit/publish dates, owners, and sharing settings. See more about this endpoint here.
import hextoolkit as htk
import pandas as pd

# listProjects_workspace_api_token holds your workspace API token
api_client = htk.get_api_client(listProjects_workspace_api_token)
  2. Data Processing: The raw project data is processed into a DataFrame. Additional columns are computed, such as days since creation, last edit, and last publish, as well as URLs for viewing and editing each project. The code snippet below demonstrates how you can paginate through the listProjects response and build up the resulting DataFrame.
# Fetch the first page of projects, including sharing details
first_page = api_client.list_projects(limit=100, include_sharing=True)
after = first_page.pagination.after
projects = first_page.values

# Keep requesting pages until the API stops returning an `after` cursor
while after:
    response = api_client.list_projects(limit=100, include_sharing=True, after=after)
    after = response.pagination.after
    projects = [*projects, *response.values]

project_data = []

# Flatten each project into one row of metadata
for project in projects:
    project_id = project.id  # avoid shadowing the built-in id()
    title = project.title
    views_all = project.analytics.app_views.all_time
    views_30d = project.analytics.app_views.last_thirty_days
    views_14d = project.analytics.app_views.last_fourteen_days
    views_7d = project.analytics.app_views.last_seven_days
    categories = [x.name for x in project.categories]
    collections = [x.collection.name for x in project.sharing.collections]
    created_at = project.created_at
    creator = project.creator.email
    description = project.description
    last_edited = project.last_edited_at
    last_published = project.last_published_at
    owner = project.owner.email
    full_access_groups = [x.group.name for x in project.sharing.groups if x.access == "FULL_ACCESS"]
    workspace_access = project.sharing.workspace.access
    status = None if project.status is None else project.status.name
    asset_type = project.type
    project_data.append((project_id, title, views_all, views_30d, views_14d, views_7d, categories, collections, created_at, creator, description, last_edited, last_published, owner, full_access_groups, workspace_access, status, asset_type))

all_projects = pd.DataFrame(project_data, columns=["id", "title", "views_all", "views_30d", "views_14d", "views_7d", "categories", "collections", "created_at", "creator", "description", "last_edited", "last_published", "owner", "full_access_groups", "workspace_access", "status", "type"])
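
The "days since" columns mentioned in this step are not shown in the snippet above, so here is a minimal sketch of one way to compute them. It assumes the created_at, last_edited, and last_published columns hold ISO 8601 strings or datetime values that pandas can parse. The helper name add_age_columns, the new column names, and the sample rows are hypothetical and exist only for illustration.

```python
import pandas as pd


def add_age_columns(df, now=None):
    """Return a copy of df with whole-day ages for each timestamp column."""
    # Use the current UTC time unless a fixed reference time is supplied
    now = pd.Timestamp.now(tz="UTC") if now is None else now
    out = df.copy()
    # Hypothetical output column names, keyed by the source timestamp column
    column_map = {
        "created_at": "days_since_created",
        "last_edited": "days_since_edited",
        "last_published": "days_since_published",
    }
    for source, target in column_map.items():
        # Missing timestamps (e.g. never published) become NaT, then NaN
        timestamps = pd.to_datetime(out[source], utc=True)
        out[target] = (now - timestamps).dt.days
    return out


# Made-up rows standing in for all_projects
sample = pd.DataFrame({
    "title": ["Revenue dashboard", "Scratch analysis"],
    "created_at": ["2024-01-01T00:00:00Z", "2024-01-06T00:00:00Z"],
    "last_edited": ["2024-01-09T00:00:00Z", "2024-01-06T12:00:00Z"],
    "last_published": ["2024-01-10T00:00:00Z", None],
})

aged = add_age_columns(sample, now=pd.Timestamp("2024-01-11", tz="UTC"))
print(aged[["title", "days_since_created", "days_since_edited", "days_since_published"]])
```

In the real project you would call add_age_columns(all_projects) and leave now unset. Passing a fixed now, as in the sample, just keeps the output reproducible.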
  3. Rule-Based Audits: The project applies two main compliance rules:
    • Rule 1: Identifies projects with a "production"-level status that are NOT included in a Collection.
    • Rule 2: Identifies projects shared with the entire Hex workspace that do NOT have a production-level status applied to them.

Below is an example code snippet that shows how to implement the first rule.

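# Rule 1
rule_1_name = "Rule 1 - 'Production'-level projects must be in a Collection."

# Statuses your workspace treats as production-level
production_statuses = ['Blessed', 'Production', 'Approved']

# Example selection of columns to show in the audit output; adjust as needed
columns_to_display = ["title", "owner", "status", "collections", "workspace_access"]

# Flag production-level projects that are not in any Collection
filter_rule1 = (all_projects['collections'].apply(len).lt(1)) & (all_projects["status"].isin(production_statuses))
rule_1_noncompliant_projects = all_projects[filter_rule1][columns_to_display]
n_rule1 = rule_1_noncompliant_projects.shape[0]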
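
The second rule can be sketched along the same lines. This is an illustration under stated assumptions, not the template project's actual implementation. It assumes that the workspace_access column holds a string access level and that "NONE" (or a missing value) means the project is not shared with the whole workspace. The "CAN_VIEW" and "NONE" values, the no_workspace_access list, and the sample rows are all assumptions, so check them against the values your workspace actually returns.

```python
import pandas as pd

# Rule 2
rule_2_name = "Rule 2 - Workspace-shared projects must have a 'Production'-level status."

production_statuses = ["Blessed", "Production", "Approved"]

# Assumption: these values mean "not shared with the whole workspace"
no_workspace_access = ["NONE"]

# Made-up rows standing in for all_projects
all_projects_sample = pd.DataFrame({
    "title": ["Revenue dashboard", "Scratch analysis", "Private draft"],
    "status": ["Production", None, None],
    "workspace_access": ["CAN_VIEW", "CAN_VIEW", "NONE"],
})

# Shared with the workspace: an access level is present and is not "NONE"
shared_with_workspace = (
    all_projects_sample["workspace_access"].notna()
    & ~all_projects_sample["workspace_access"].isin(no_workspace_access)
)

# Missing statuses count as not production-level
not_production = ~all_projects_sample["status"].isin(production_statuses)

rule_2_noncompliant_projects = all_projects_sample[shared_with_workspace & not_production]
n_rule2 = rule_2_noncompliant_projects.shape[0]

print(rule_2_name)
print(rule_2_noncompliant_projects[["title", "status", "workspace_access"]])
```

To run this against real data, swap all_projects_sample for the all_projects DataFrame built earlier.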

To get a full version of this template project, click on the Download button below, which will give you a .yaml file of the project. You can then upload that .yaml to your own Hex workspace and start building off of it directly! The template project goes into full detail on how to call the listProjects endpoint and also includes some extra examples of how to dynamically explore all the rich information that's returned about your workspace's projects. The two rules that we assess here are just that: examples. Depending on what's relevant for your team, you can always tweak or redefine the rules to cover the things that matter most!

Download Template Project