Feedback for alter-level dataset in R

Hi all,
I’m working with Prof. Mattia Vacchiano on The Empty Office. We’ve used Network Canvas to collect interview data and I’m now building an alter-level dataset based on these data. This is relatively new to me, so I wanted to check in here to see whether I’m on the right track before continuing.
What I have now is a dataset composed of alters with their characteristics, their egos’ characteristics, and their alter-alter ties. I’ve opted to include alter ties as binary columns in the dataset.
Below is the R code I’ve written. I’d appreciate your thoughts on the approach (especially the handling of alter-alter ties), anything that could be improved, and any blind spots I may have.
Thank you.

# nc_data holds the Network Canvas data loaded with ideanet
alter_data <- nc_data$alters
ego_data <- nc_data$egos
colnames(alter_data)[colnames(alter_data) == "X"] <- "alterX"
colnames(ego_data)[colnames(ego_data) == "X"] <- "egoX"

# Attach each ego's characteristics to its alters
alter_level_noaa <- merge(alter_data, ego_data, by = "networkCanvasEgoUUID")

alter_edgelists <- nc_data$alter_edgelists

alter_level_data <- alter_level_noaa

all_alters <- alter_level_noaa$networkCanvasUUID

# Add one column per alter UUID, initialised to NA
for (i in all_alters) {
  alter_level_data[[i]] <- NA
}

unique_egos <- unique(alter_level_noaa$networkCanvasEgoUUID)

for (ego in unique_egos) {
  alters_by_ego <- subset(alter_level_noaa, networkCanvasEgoUUID == ego)
  altedges_by_ego <- subset(alter_edgelists, networkCanvasEgoUUID == ego)
  # seq_len() skips the loop for egos with no alter-alter ties;
  # 1:nrow() would iterate over c(1, 0) in that case
  for (i in seq_len(nrow(altedges_by_ego))) {
    source_id <- altedges_by_ego$networkCanvasSourceUUID[i]
    target_id <- altedges_by_ego$networkCanvasTargetUUID[i]
    # Only the source alter's row is marked. If ties are undirected, the
    # reverse cell (target's row, source's column) ends up FALSE below.
    alter_level_data[alter_level_data$networkCanvasUUID == source_id, target_id] <- TRUE
  }

  sources_this_ego <- alters_by_ego$networkCanvasUUID
  targets_this_ego <- alters_by_ego$networkCanvasUUID

  # Within the same ego, any pair without a recorded tie becomes FALSE;
  # pairs of alters from different egos stay NA
  for (s in sources_this_ego) {
    for (t in targets_this_ego) {
      if (is.na(alter_level_data[alter_level_data$networkCanvasUUID == s, t])) {
        alter_level_data[alter_level_data$networkCanvasUUID == s, t] <- FALSE
      }
    }
  }
}

Hi, there! Any chance you could post some screenshots illustrating how your data structure is changing alongside the code? That would help greatly in seeing if we can provide any recommendations.

Thanks for the response. Here you can see the changes, starting from loading the NC data using ideanet, then creating an alter-level dataset where each alter is a row and the ego is a variable, then adding alter-alter ties as binary columns (NA for alters belonging to different egos).

Looking at the screenshot, it appears that you’ve succeeded in what you set out to do. But you may want to ask yourself what benefits you hope to gain from formatting your data this way before proceeding. Right now your alter-level dataset contains an abundance of rows whose values are predominantly NA (as seen in your screenshot). This might get unwieldy. Wide-format datasets for ego networks typically have columns labeled alter1, alter2, alter3, …, whose values contain the ID numbers of specific alters. If you’re committed to having a wide dataset, you may want to consider that format. As the dataset currently stands, you’ll have a lot of columns named after Network Canvas UUIDs. These columns seem like they would be hard to select during analysis.
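As a rough illustration of the alter1, alter2, alter3 layout, here is a minimal base-R sketch. The toy IDs and the helper column alter_index are made up for the example; only the networkCanvasUUID and networkCanvasEgoUUID column names come from the code earlier in the thread, and the real data may need a different ordering of alters.

```r
# Toy alter table; every ID here is made up for illustration.
alters <- data.frame(
  networkCanvasEgoUUID = c("ego1", "ego1", "ego1", "ego2", "ego2"),
  networkCanvasUUID    = c("a1", "a2", "a3", "b1", "b2"),
  stringsAsFactors = FALSE
)

# Number the alters within each ego (1, 2, 3, ...) in order of appearance.
# alter_index is a hypothetical helper column, not a Network Canvas field.
alters$alter_index <- ave(
  seq_along(alters$networkCanvasEgoUUID),
  alters$networkCanvasEgoUUID,
  FUN = seq_along
)

# One row per ego, one column per alter slot.
wide <- reshape(
  alters,
  idvar     = "networkCanvasEgoUUID",
  timevar   = "alter_index",
  v.names   = "networkCanvasUUID",
  direction = "wide",
  sep       = ""
)

# Rename networkCanvasUUID1, networkCanvasUUID2, ... to alter1, alter2, ...
names(wide) <- sub("^networkCanvasUUID", "alter", names(wide))
```

Egos with fewer alters than the largest network get NA in the unused slots, so the number of alter columns is bounded by the largest ego network rather than by the total number of alters.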

When you take an alter-alter edgelist and convert it to a wide format, you’re effectively turning the edgelist into an adjacency matrix. You may want to consider creating a list object in R that stores each ego network’s “adjacency matrix” as its own data frame. Assuming the first column in each data frame would contain alter IDs, you could then merge each adjacency matrix into the alter-level dataset as needed. Of course, this creates a more complicated data structure that isn’t easily saved into a single CSV file.
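One way this list-of-adjacency-matrices idea might look in base R is sketched below. The toy data and the function name make_adjacency are made up for the example, the column names are taken from the code earlier in the thread, and the sketch assumes alter-alter ties are undirected (drop the reverse assignment if they are not).

```r
# Toy data; every ID here is made up for illustration.
alters <- data.frame(
  networkCanvasEgoUUID = c("ego1", "ego1", "ego1", "ego2", "ego2"),
  networkCanvasUUID    = c("a1", "a2", "a3", "b1", "b2"),
  stringsAsFactors = FALSE
)
edges <- data.frame(
  networkCanvasEgoUUID    = c("ego1", "ego2"),
  networkCanvasSourceUUID = c("a1", "b1"),
  networkCanvasTargetUUID = c("a2", "b2"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: build one ego's adjacency matrix as a data frame
# whose first column holds the alter IDs.
make_adjacency <- function(alter_ids, ego_edges) {
  adj <- matrix(FALSE, nrow = length(alter_ids), ncol = length(alter_ids),
                dimnames = list(alter_ids, alter_ids))
  if (nrow(ego_edges) > 0) {
    src <- ego_edges$networkCanvasSourceUUID
    tgt <- ego_edges$networkCanvasTargetUUID
    adj[cbind(src, tgt)] <- TRUE
    adj[cbind(tgt, src)] <- TRUE  # assumption: ties are undirected
  }
  data.frame(networkCanvasUUID = alter_ids, adj,
             check.names = FALSE, stringsAsFactors = FALSE, row.names = NULL)
}

# One adjacency data frame per ego, stored in a named list.
ego_ids <- unique(alters$networkCanvasEgoUUID)
adjacency_by_ego <- lapply(ego_ids, function(ego) {
  make_adjacency(
    alters$networkCanvasUUID[alters$networkCanvasEgoUUID == ego],
    edges[edges$networkCanvasEgoUUID == ego, ]
  )
})
names(adjacency_by_ego) <- ego_ids

# Merge a single ego's matrix into its alters only when it is needed.
ego1_alters <- merge(
  alters[alters$networkCanvasEgoUUID == "ego1", ],
  adjacency_by_ego[["ego1"]],
  by = "networkCanvasUUID"
)
```

Because each matrix only covers one ego's alters, there are no cross-ego NA cells. The list cannot be written to a single CSV, but saveRDS() can store it in one file for later sessions.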

What are you hoping to gain from reshaping your Network Canvas data? Where to go from here probably depends on your needs.
