An RDD (Resilient Distributed Dataset) is Spark's fault-tolerant collection of elements that can be operated on in parallel. There are two ways to create an RDD: parallelizing an existing collection in the driver program, or referencing a dataset in an external storage system such as HDFS. RDD operations fall into two classes: transformations, which lazily produce one or more new RDDs from an input RDD without modifying it, and actions, which trigger the actual computation and return a result. Transformations are either narrow, where each output partition depends on a single input partition (e.g. map, filter), or wide, where an output partition draws data from multiple input partitions and therefore requires a shuffle (e.g. groupByKey, reduceByKey).