Hi, I have a Spark Dataset with the sample data below. Its structure is defined by this case class:
case class CustomerDocument(
  customerId: String,
  forename: String,
  surname: String,
  // Accounts for this customer
  accounts: Seq[AccountData],
  // Addresses for this customer
  address: Seq[AddressData]
)
|customerId|forename   |surname|accounts|address|
|IND0001   |Christopher|Black  |[]      |[[ADR360,IND0001,762, East 14th Street, New York, United States of America,762, East 14th Street, New York, United States of America]]|
The schema is:
root
|-- customerId: string (nullable = true)
|-- forename: string (nullable = true)
|-- surname: string (nullable = true)
|-- accounts: array (nullable = true)
| |-- element: struct (containsNull = true)
| | |-- customerId: string (nullable = true)
| | |-- accountId: string (nullable = true)
| | |-- balance: long (nullable = true)
|-- address: array (nullable = true)
| |-- element: struct (containsNull = true)
| | |-- addressId: string (nullable = true)
| | |-- customerId: string (nullable = true)
| | |-- address: string (nullable = true)
| | |-- number: integer (nullable = true)
| | |-- road: string (nullable = true)
| | |-- city: string (nullable = true)
| | |-- country: string (nullable = true)
I need to read the "country" value from each struct in the "address" array, check whether a particular country (e.g. Canada) is present, and create a new column "isPresent" that is true if Canada is present and false if it is not.
I am not sure how to index into the array of structs in the data above. Any help or guidance on how to do this is appreciated. Thanks.
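
One possible approach, as a minimal sketch rather than a definitive answer: selecting `address.country` on an array of structs gives an array of the country strings, so `array_contains` can test for a value without indexing individual elements. The `AccountData` and `AddressData` definitions below are assumptions reconstructed from the printed schema (they are not shown in the question), and `IsPresentSketch` and `withIsPresent` are hypothetical names used only for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{array_contains, coalesce, col, lit}

// Assumed shapes for the nested types; only their printed schema appears in the question.
case class AccountData(customerId: String, accountId: String, balance: Option[Long])
case class AddressData(addressId: String, customerId: String, address: String,
                       number: Option[Int], road: String, city: String, country: String)
case class CustomerDocument(customerId: String, forename: String, surname: String,
                            accounts: Seq[AccountData], address: Seq[AddressData])

object IsPresentSketch {
  // col("address.country") on an array of structs yields an array of country strings,
  // so no explicit index is needed. coalesce turns a null result (for example a
  // null address array) into false.
  def withIsPresent(df: DataFrame, country: String): DataFrame =
    df.withColumn("isPresent",
      coalesce(array_contains(col("address.country"), country), lit(false)))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("isPresent").getOrCreate()
    import spark.implicits._

    // The sample row from the question.
    val customers = Seq(
      CustomerDocument("IND0001", "Christopher", "Black", Seq.empty, Seq(AddressData(
        "ADR360", "IND0001", "762, East 14th Street, New York, United States of America",
        Some(762), "East 14th Street", "New York", "United States of America")))
    ).toDS()

    withIsPresent(customers.toDF(), "Canada").select("customerId", "isPresent").show(false)
    spark.stop()
  }
}
```

For the sample row, whose only address is in the United States of America, isPresent should come out false. Note that `array_contains` is an exact, case-sensitive match, so "canada" would not match "Canada"; on Spark 3.0 or later, the `exists` higher-order function combined with `lower` may be an option if a case-insensitive check is needed.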