Managing the data: exporting raw data, connecting (different format) datasets

merlinleunda · March 23, 2023, 5:43pm

Hello,

I'm new to both Social Network Analysis and Network Canvas, and I have some questions about the data types and the data export. I ran some tests with the demonstration protocol and exported the data (without using the server option).

If I understand correctly, for each mapped relationship type, we will have one CSV…

Concerning the edge attribute list, I read in the guide that “Each row will have a key to link to ego (networkCanvasEgoUUID), as well as source and target columns that reference both the UUID and the automatically incrementing export ID”, which I assume refers to the codes in each cell of the edgeList CSV (e.g. 9cf1576d-4cd6-4f5c-a516-5582aa1509f3). What are these codes (networkCanvasEgoUUID, networkCanvasUUID, networkCanvasSourceUUID, networkCanvasTargetUUID)? What is the “automatically incrementing ID (that is only consistent on a per-export basis)”? How can I build a matrix from this data? How can I connect it with the ego attributes? And how can I build graphs?

I have the same questions about the alter attribute list…

Also, I would like to “connect” the data that I will obtain with Network Canvas to another dataset (an Excel file organized completely differently) that maps other relationships and attributes for (mostly) the same actors. Is this going to be a problem? I'm worried about the difficulty of linking datasets obtained with two different apps: the attributes (and some networks) will be collected with Network Canvas, while other (spatialized) relationships between actors must be collected with QField. In this case, the raw data will be a CSV with a list of names in each cell (where the rows will be the different actors' areas, and the columns will be the network and attribute variables).

Thank you in advance for your help!

Kind regards,
Merlin

Michelle · March 24, 2023, 8:39pm

Hi Merlin,

Please take a look at our tutorial for working with Network Canvas data. I believe it will answer many of your questions.

https://documentation.networkcanvas.com/tutorials/working-with-data/

All the best!
Michelle

merlinleunda · March 27, 2023, 3:09pm

Hi Michelle,

Thank you very much for your quick answer. The tutorial is very interesting and well explained. However, I don't see anything about the possibility of joining or concatenating the dataset obtained with Network Canvas with another dataset (the raw network data will be a list of names in each cell of the Excel/CSV document).

Any idea how I might do this?

Thank you in advance for your help!

Best regards,
Merlin

Michelle · March 29, 2023, 2:48pm

Hi Merlin,

More than likely there will be a mismatch between the names indicated by your study participants and the names listed within your CSV file. Therefore, you will likely need to do some manual cleaning/matching of the data before you're able to join the files. Depending on how large/messy your dataset is, you may also wish to look into using a record linkage algorithm to help streamline the process. There are ones out there specifically for names. I am not up to date, though, on what is available. Perhaps others on the board might have some ideas?
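To give a rough idea of what automated name matching can look like, here is a minimal sketch using approximate string matching from Python's standard library (`difflib`). It is not a full record linkage method, only a first pass that flags likely matches for manual review. The helper names (`normalize`, `match_names`), the example names, and the 0.8 similarity cutoff are all assumptions made for illustration.

```python
import difflib


def normalize(name):
    """Lowercase and collapse whitespace so trivial differences do not block a match."""
    return " ".join(name.lower().split())


def match_names(reported, reference, cutoff=0.8):
    """Map each reported name to its closest reference name, or None if none is close enough."""
    # Look up the original spelling from the normalized form.
    lookup = {normalize(ref): ref for ref in reference}
    matches = {}
    for name in reported:
        close = difflib.get_close_matches(
            normalize(name), list(lookup), n=1, cutoff=cutoff
        )
        matches[name] = lookup[close[0]] if close else None
    return matches


# Hypothetical example data: names typed by participants vs. names in the other CSV.
reported = ["Maria Gonzales", "jon smith", "Unknown Person"]
reference = ["Maria Gonzalez", "John Smith", "Ana Pereira"]

result = match_names(reported, reference)
for name, match in result.items():
    print(f"{name!r} -> {match!r}")
```

Names that come back as `None` (or with a surprising match) are the ones to check by hand. A lower cutoff catches more spelling variants but also produces more false matches.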

Hope that is a help!

Best,
Michelle

merlinleunda · April 3, 2023, 2:51pm

Dear Michelle,

Thank you again for your reply and the suggestions!

As I'm doing a whole network study, I hope that the names will match without too much cleaning… To limit this problem, I'm planning to work with “Name generator for roster data” for each relationship I would like to map. By the way, I noticed that once the participant has selected a name, that name no longer appears in the name list for the other questions of the survey. How could I fix this? I would like to present the same list of names for each relationship of interest.

My other question is about concatenating a dataset that is just a list of names for each mapped relationship (one cell = one list of names) with the datasets obtained with Network Canvas, as I don't understand what these codes are (networkCanvasEgoUUID, networkCanvasUUID, networkCanvasSourceUUID, networkCanvasTargetUUID), or what the “automatically incrementing ID (that is only consistent on a per-export basis)” is. Maybe this will not be a problem if the datasets obtained with Network Canvas are transformed into matrices in CSV?

Really sorry to disturb and thank you in advance for the help!

Best regards,
Merlin

Joshua · April 12, 2023, 12:34pm

I think you got your answer in your other thread, but just in case anyone else comes across this post, this is “by design”. In Network Canvas, the mental model we use for the roster functionality is that you are adding nodes to the interview network, which you can subsequently add further attributes to elsewhere in the interview. When a node is nominated on a roster interface, it is added to the interview network, and therefore does not show up to be added again in any subsequent roster interfaces.

There is one exception to this, which is when you use the name generator interface using forms, or the name generator interface with quick add functionality. These interfaces allow multiple “side panels” to be displayed to the participant, where one can be a roster, and the other can be the current nodes in the interview network.

If you take the time to explore the data you get from your study, I think the purpose of these columns will become clear.

To summarize, UUID variables (short for “universally unique identifier”) that start with networkCanvas[...] are generated by the software to link to columns in other CSV files. For example, the value of networkCanvasSourceUUID in an edge list file corresponds to the networkCanvasUUID column in a node list file. Similarly, networkCanvasEgoUUID points to the ego defined in the ego attributes file (this is primarily useful when you merge multiple interviews together, and therefore have multiple egos).
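As a rough illustration of how these columns fit together, here is a minimal Python sketch (standard library only) that looks up each edge's source and target in a node list and builds an adjacency matrix. The inline CSV text, the shortened IDs (real values look like 9cf1576d-4cd6-4f5c-a516-5582aa1509f3), and the `name` column are assumptions for the example. Only the networkCanvas[...] column names come from the export. Treating the edges as undirected is also an assumption.

```python
import csv
import io

# Hypothetical, shortened stand-ins for an exported node list and edge list.
NODES_CSV = """networkCanvasEgoUUID,networkCanvasUUID,name
ego-1,node-a,Ana
ego-1,node-b,Ben
ego-1,node-c,Cleo
"""

EDGES_CSV = """networkCanvasEgoUUID,networkCanvasSourceUUID,networkCanvasTargetUUID
ego-1,node-a,node-b
ego-1,node-b,node-c
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


nodes = read_rows(NODES_CSV)
edges = read_rows(EDGES_CSV)

# networkCanvasUUID identifies each node; source/target columns in the edge list point to it.
name_by_uuid = {row["networkCanvasUUID"]: row["name"] for row in nodes}
order = sorted(name_by_uuid)  # fixed row/column order for the matrix
index = {uuid: i for i, uuid in enumerate(order)}

matrix = [[0] * len(order) for _ in order]
for edge in edges:
    i = index[edge["networkCanvasSourceUUID"]]
    j = index[edge["networkCanvasTargetUUID"]]
    matrix[i][j] = 1
    matrix[j][i] = 1  # assumption: the relationship is undirected

labels = [name_by_uuid[uuid] for uuid in order]
print(labels)
for row in matrix:
    print(row)
```

With several interviews merged into one file, the same approach would be applied per ego by first grouping rows on networkCanvasEgoUUID.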

Hope this helps!